
Compression and decompression

Compression ?

Compression: the process of coding that will effectively reduce the total number of bits needed to represent certain information.

What is compression ?

Compression is particularly useful in communications because it enables devices to transmit or store the same amount of data in fewer bits (Webopedia).
Compression reduces the size of the available data. It is achieved by removing redundant data.
Advantages:
Reduction in the size of the file.
Increase in transfer speed.


On the downside, compressed data must be decompressed to be used, and this extra processing may be problematic for some applications.
For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed. The option of decompressing the video in full before watching it may be inconvenient, and it requires storage space for the decompressed video.


How is compression possible?

Redundancy in digital audio, image, and video data
Properties of human perception

Digital audio is a series of sample values.
An image is a rectangular array of pixel values.
Video is a sequence of images played out at a certain rate.

In digital audio, adjacent samples are similar (predictive encoding), and samples corresponding to silence can be dropped (silence removal).
In a digital image, neighboring samples on a scanning line are normally similar (spatial redundancy).
In digital video, in addition to spatial redundancy, neighboring images in a video sequence may be similar (temporal redundancy).
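The idea that adjacent samples are similar can be illustrated with a minimal sketch of delta (predictive) encoding. The function names and the sample values below are made up for illustration; real audio codecs use more elaborate predictors.

```python
def delta_encode(samples):
    """Keep the first sample, then store each sample's difference from its predecessor."""
    if not samples:
        return []
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the samples by accumulating the stored differences."""
    samples = []
    for i, d in enumerate(deltas):
        samples.append(d if i == 0 else samples[-1] + d)
    return samples


# Hypothetical audio samples: neighbors differ only slightly.
samples = [100, 102, 103, 103, 101, 100]
deltas = delta_encode(samples)
print(deltas)                # [100, 2, 1, 0, -2, -1]
print(delta_decode(deltas))  # [100, 102, 103, 103, 101, 100]
```

The differences are small numbers, so they can be stored in fewer bits than the samples themselves.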

Need of compression
To manage large data objects efficiently, they need to be compressed to reduce the file size. Large data objects cause two problems:

Storage
Transmission

Compression algorithms try to eliminate redundancies in the data and thus reduce the storage required.

Why multimedia compression?

A square inch of a 400 dpi image consists of 160,000 dots (pixels). If each dot (pixel) representation uses 8 bits of gray level, this information becomes 8 x 160,000 bits.

When multimedia data objects like images, color images, videos, audio, and animated images are digitized, a large amount of data is generated. An uncompressed data object can be of the order of several megabytes.
The exact amount of data depends on the resolution. As the resolution increases from 200 dpi to 400 dpi, the size of the data increases.
Each pixel is a bit of data with an associated color.
These data objects need to be stored and transmitted.
Calculate!


Image and video require vast amounts of data:

320x240x8 bits grayscale image: 77 KB
1100x900x24 bits color image: 3 MB
640x480x24 bits x 30 frames/sec video: 27.6 MB/sec
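The figures above can be checked with a few lines of arithmetic. This is a small sketch; the helper name `raw_size_bytes` is made up for illustration, and sizes are reported in bytes with 1 KB = 1000 bytes.

```python
def raw_size_bytes(width, height, bits_per_pixel, frames=1):
    """Uncompressed size in bytes: pixels x bits per pixel x frames, divided by 8."""
    return width * height * bits_per_pixel * frames // 8


print(raw_size_bytes(400, 400, 8))        # one square inch at 400 dpi, 8-bit gray: 160000
print(raw_size_bytes(320, 240, 8))        # grayscale image: 76800 (about 77 KB)
print(raw_size_bytes(1100, 900, 24))      # color image: 2970000 (about 3 MB)
print(raw_size_bytes(640, 480, 24, 30))   # one second of video: 27648000 (about 27.6 MB)
```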


Network bandwidth does not allow for real-time transmission of uncompressed video.
Compression reduces storage requirements.

Types of compression:

Lossless (unaffected by loss):
1. Packbits encoding (run-length encoding)
2. CCITT Group 3 1D
3. CCITT Group 3 2D
4. CCITT Group 4 2D

Lossy (affected by loss):
1. JPEG
2. MPEG
3. Intel DVI
4. H.261
5. Fractal

Compression schemes are also classified as symmetric or asymmetric.


An example of lossless vs. lossy compression is the following string:

25.888888888

Lossless:
This string can be compressed as 25.[9]8. Interpreted as "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form.

Lossy:
The string can be compressed as 26 instead. The exact original data is lost, at the benefit of a shorter representation.


Lossless:

With lossless compression, every single bit of data that was originally in the file remains after the file is uncompressed. All of the information is completely restored. This is generally the technique of choice for text or spreadsheet files, where losing words or financial data could pose a problem. The Graphics Interchange Format (GIF) is an image format used on the Web that provides lossless compression. The PKZIP compression technology is an example of lossless compression; its DEFLATE algorithm, for example, is used in the ZIP file format and in the Unix tool gzip.
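The lossless property can be demonstrated with Python's standard `zlib` module, which implements the same DEFLATE family of compression used by ZIP and gzip. This is only an illustrative round trip; the input data is made up.

```python
import zlib

# Hypothetical, highly redundant input: the same short string repeated 100 times.
original = b"25.888888888" * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # the compressed form is much shorter
print(restored == original)            # True: every byte is restored
```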

Lossy :

Lossy compression reduces a file by permanently eliminating certain information. Lossy compression technologies attempt to eliminate redundant or unnecessary information. When the file is uncompressed, a part of the original information is not there (although the user may not notice it). Lossy compression is generally used for video and sound, where a certain amount of information loss will not be detected by users. The JPEG image file, commonly used for photographs and other complex still images on the Web, is an image format that uses lossy compression.

Lossy image compression is used in digital cameras, to increase storage capacities with minimal degradation of picture quality.
Similarly, DVDs use the lossy MPEG-2 video codec for video compression.
In lossy audio compression, non-audible (or less audible) components of the signal are removed.
Compression of human speech is often performed with even more specialized techniques, so that "speech compression" is sometimes distinguished as a separate discipline from "audio compression".
Lossy compression is mostly used in image/video compression, e.g. JPEG, MPEG, H.261, fractal.


Symmetric compression:

It takes an equal amount of time for compression and decompression.
It uses nearly the same method for compression and decompression.
E.g.: JPEG

Asymmetric compression:

Compression and decompression take different amounts of time.
They use different techniques.
E.g.: Fractal

RLE (Packbits Encoding):

Run Length Encoding (RLE) is a simple and popular data compression algorithm. It is based on the idea to replace a long sequence of the same symbol by a shorter sequence and is a good introduction into the data compression field for newcomers. RLE also refers to a little-used image format in Windows 3.x, with the extension .rle, which is a Run Length Encoded Bitmap, used to compress the Windows 3.x startup screen.

Run-length encoding is a data compression algorithm that is supported by most bitmap file formats, such as TIFF, BMP, and PCX. RLE is suited for compressing any type of data regardless of its information content, but the content of the data will affect the compression ratio achieved by RLE. RLE works by reducing the physical size of a repeating string of characters. This repeating string, called a run, is typically encoded into two bytes.

The first byte represents the number of characters in the run and is called the run count. The second byte is the value of the character in the run and is called the run value.

Consider a run of 15 A characters, which uncompressed requires 15 bytes to store:

AAAAAAAAAAAAAAA

The same string after RLE encoding would require only two bytes:

15A

The 15A code generated to represent the character string is called an RLE packet. Here, the first byte, 15, is the run count and contains the number of repetitions. The second byte, A, is the run value and contains the actual repeated value in the run.

A new packet is generated each time the run character changes, or each time the number of characters in the run exceeds the maximum count. Assume that our 15-character string now contains four different character runs:

AAAAAAbbbXXXXXt

Using run-length encoding this could be compressed into four 2-byte packets:

6A3b5X1t

Thus, after run-length encoding, the 15-byte string would require only eight bytes of data to represent the string, as opposed to the original 15 bytes. In this case, run-length encoding yielded a compression ratio of almost 2 to 1.

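But observe how RLE encoding doubles the size of a 14-character string in which no two adjacent characters are the same: every character becomes its own 2-byte packet with a run count of 1, so the 14-byte string requires 28 bytes after RLE encoding.

The run count can also be written in hexadecimal format. E.g.:

00000001111111111

becomes:

0x07 0x00 0x0A 0x01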
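The packet scheme described in this section can be sketched in a few lines. This is a minimal illustration, not the PackBits format itself: packets are kept as (run count, run value) tuples, and the default maximum count of 255 is an assumption based on the run count fitting in one byte.

```python
def rle_encode(data, max_count=255):
    """Encode a string as a list of (run count, run value) packets."""
    packets = []
    for symbol in data:
        # Extend the current run unless the symbol changes or the count is at its maximum.
        if packets and packets[-1][1] == symbol and packets[-1][0] < max_count:
            packets[-1] = (packets[-1][0] + 1, symbol)
        else:
            packets.append((1, symbol))
    return packets


def rle_decode(packets):
    """Expand each packet back into its run of repeated symbols."""
    return "".join(symbol * count for count, symbol in packets)


print(rle_encode("AAAAAAAAAAAAAAA"))    # [(15, 'A')]
print(rle_encode("00000001111111111"))  # [(7, '0'), (10, '1')]
print(len(rle_encode("abcdefghijklmn")))  # 14 packets, i.e. 28 bytes for 14 characters
```

Decoding with `rle_decode` returns the original string, which is what makes RLE a lossless scheme.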

Pros and Cons

RLE schemes are simple and fast, but their compression efficiency depends on the type of data being encoded. A black-and-white image that is mostly white will encode very well, due to the large amount of contiguous data that is all the same color. An image with many colors that is very busy in appearance, such as a photograph, will not encode very well. This is because the complexity of the image is expressed as a large number of different colors. Because of this complexity there will be relatively few runs of the same color, resulting in a larger number of packets.


Common formats for run-length encoded data include Truevision TGA, PackBits, PCX and ILBM.
Run-length encoding is used in fax machines (combined with other techniques into Modified Huffman coding). It is relatively efficient because most faxed documents are mostly white space, with occasional interruptions of black.
Data that have long sequential runs of bytes (such as lower-quality sound samples) can be RLE compressed.


CCITT: Consultative Committee for International Telephony and Telegraphy. An organization that sets communications standards.

CCITT, now known as ITU-T (a sector of the parent organization, the International Telecommunication Union), has defined many important standards for data communications. ITU is based in Geneva, Switzerland.


CCITT Group 3 is the universal protocol for sending fax documents through a phone line. In this group each scan line is encoded independently.
CCITT Group 3 1D: a scan line is encoded as a set of runs, each representing a number of white or black pixels. Every run is encoded using a different number of bits, which can be uniquely identified when decoded.

This scheme was designed for black and white images only, not for color images. This technique makes use of Huffman encoding, a.k.a. horizontal encoding. The Huffman encoding scheme is based on a coding tree, which is constructed from the probability of occurrence of white or black pixel run lengths in the bit stream. As a result, shorter codes are developed for frequently occurring run lengths and longer codes for less frequent run lengths.

Table 8.14 (with continuation): Terminating codes and makeup codes. Columns: number of pixels n, code for n white pixels, code for n black pixels.

Codes for runs of 0 to 63 pixels are known as terminating codes. The codes for runs of 64 pixels and up, in multiples of 64, are called makeup codes. Using a combination of makeup and terminating codes, any run length is represented. Run-length codes for black pixels are different from those for white pixels.

Run code = nearest makeup code + remaining terminating code

E.g.: a run length of 132 white pixels is encoded by the following two codes:
Makeup code for 128 white pixels: 10010
Terminating code for 4 white pixels: 1011

So the compressed bit stream for 132 white pixels is 100101011. Hence, compression ratio = total no. of bits / no. of bits used to code them = 132/9 ≈ 14.7

This means that frequently occurring lengths of run may be encoded very efficiently, at the expense of the infrequent ones.
For example, a black run of 2 or 3 pixels is encoded using just 2 bits, whereas 1000 black pixels are encoded in 25 bits (13 for 960 + 12 for 40).
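Splitting a run into a makeup part and a terminating part can be sketched as follows. This is only an illustration: the two lookup tables are deliberately partial and hold just the white-pixel codes quoted in the 132-pixel example, the function names are made up, and the sketch assumes run lengths within the range covered by the makeup-code table.

```python
# Partial code tables: only the white-run codes quoted in the text.
WHITE_MAKEUP = {128: "10010"}
WHITE_TERMINATING = {4: "1011"}


def split_run(run_length):
    """Split a run into its makeup part (a multiple of 64) and terminating part (0 to 63)."""
    return (run_length // 64) * 64, run_length % 64


def encode_white_run(run_length):
    """Concatenate the makeup code (if any) and the terminating code for a white run."""
    makeup, terminating = split_run(run_length)
    bits = WHITE_MAKEUP[makeup] if makeup else ""
    return bits + WHITE_TERMINATING[terminating]


print(split_run(132))                            # (128, 4)
print(encode_white_run(132))                     # 100101011
print(round(132 / len(encode_white_run(132)), 1))  # 14.7
print(split_run(1000))                           # (960, 40)
```

A complete encoder would fill both tables for white and black runs from the standard's code tables; a lookup for any run not in these partial tables raises `KeyError`.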

CCITT Group 3 1D File Format:


Advantages:
Simple to implement in both software and hardware.
Worldwide standard for black and white images, and hence for fax.

Disadvantages:
It is horizontal (one-dimensional): it encodes each row or line separately.
There is very little difference between a scan line and several lines before and after it. This redundancy is not utilized by the scheme.
No error protection mechanism. Misinterpretation is possible: since each item of information on a line is a change from the previous one, misinterpreting one change can cause the rest of the image to reverse its colors.

This led to CCITT Group 3 2D.

CCITT Group 3 2D

This scheme is commonly used for document imaging systems and for fax. It provides a good compression ratio, ranging between 10 and 20. The logic: many lines differ very little from the lines above or below them. The scheme uses a k-factor: the image is divided into several groups of k lines.

The first line of every group of k lines, called the reference line, is encoded using the CCITT Group 3 1D method. The remaining lines of each group are coded with reference to the first line of the group using vertical coding.
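The k-factor grouping can be sketched as a simple plan of which lines are coded in 1D and which line each 2D-coded line refers to, following the description above. The function name and the tuple layout are assumptions made for illustration only.

```python
def plan_group3_2d(num_lines, k):
    """Return (line index, coding mode, reference line index) for every scan line."""
    plan = []
    for i in range(num_lines):
        group_start = i - (i % k)   # first line of the group of k lines
        if i == group_start:
            plan.append((i, "1D", None))          # reference line: Group 3 1D coding
        else:
            plan.append((i, "2D", group_start))   # coded relative to the reference line
    return plan


for entry in plan_group3_2d(5, 2):
    print(entry)
# (0, '1D', None)
# (1, '2D', 0)
# (2, '1D', None)
# (3, '2D', 2)
# (4, '1D', None)
```

Because every group starts again with a 1D-coded line, a transmission error can only affect the lines up to the next reference line.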

This is based on the statistical nature of images. The image data across adjacent scan lines is redundant: many of these lines have common areas of black pixels and white pixels. The information that needs to be stored is only the change in the contour of the image objects.

It uses a combination of additional codes to encode every line in the group of k lines:
Vertical code
Pass code
Horizontal code

The value of the pass code is always 0001.
The horizontal code is always 001 + Group 3 1D code = 001 + makeup code + terminating code.

There are seven types of vertical code, and the values depend on the position difference between the changing pixel in the reference line and the changing pixel in the coding line. If the black and white transition occurs on a given scan line, chances are the same transition will also occur within plus or minus 3 pixels in the next scan line.

Difference between pixel position in      Vertical code
reference line and coding line

 3                                         0000010
 2                                         000010
 1                                         010
 0                                         1
-1                                         011
-2                                         000011
-3                                         0000011
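The vertical code table translates directly into a lookup. In this minimal sketch the function name is made up, and `b` and `a` stand for the positions of the changing pixel in the reference line and in the coding line.

```python
# Vertical codes indexed by (reference position - coding position), as in the table.
VERTICAL_CODES = {
    3: "0000010",
    2: "000010",
    1: "010",
    0: "1",
    -1: "011",
    -2: "000011",
    -3: "0000011",
}


def vertical_code(b, a):
    """Return the vertical code for the difference b - a, or None if it is beyond +/-3."""
    return VERTICAL_CODES.get(b - a)


print(vertical_code(7, 6))    # difference  1 -> 010
print(vertical_code(13, 14))  # difference -1 -> 011
print(vertical_code(9, 2))    # difference  7 -> None (no vertical code applies)
```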

Worked example:

Reference line:  0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0   (b0 = 7, b1 = 13)
Coding line:     0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0   (a0 = 6, a1 = 14)

b0 - a0 = 7 - 6 = 1
b1 - a1 = 13 - 14 = -1

The bx and ax pointers are used for calculating delta values, which give information about the difference between the reference line and the current line and allow the difference to be coded.

After compression: 0 1 0 0 1 1 0 0 0 0 0 0

After applying the code, the a and b pointers move to the new location.

[Figure: reference line and coding line with the b pointers moved to their new positions.]

Second example: in the reference line, b0 = 9 and b1 = 13; in the coding line, a0 = 2 and a1 = 14.

b0 - a0 = 9 - 2 = 7
b1 - a1 = 13 - 14 = -1

[Figure: the coded line, marked with the regions where the horizontal code (HC), pass code (PC), and vertical code (VC) apply.]

If the difference between the reference line and the coding line goes beyond +3 or -3, the pass code is to be used. After applying the pass code, horizontal coding can be performed and continued until EOL is reached, or vertical coding can be continued.




Advantages:
Fast.
Due to the 2D nature, a better compression ratio than Group 3 1D.
Error handling, thanks to the k-line groups.

Disadvantages:
Lesser compression compared to Group 4 2D.
Complex.
Relatively difficult to implement in software.

CCITT Group 4 2D compression:

The compression ratio of Group 3 was not sufficient for serious, high-resolution document imaging. Group 4 is a 2D coding scheme without the k-factor; in effect, the group in this scheme is the entire page of lines.
Here, the first reference line is an imaginary all-white line above the top of the image. The first line of pixels is encoded using the imaginary white line as its reference line. Each newly coded line becomes the reference line for the next scan line, so each successive line is coded relative to the previous line. This provides a very high level of compression.

[Figure: an imaginary line of white pixels above the image; line i is coded with reference to line (i-1); the whole page is a single group.]

There are no EOL markers before the start of the compressed data. Fillers are not used for the scan lines. There is an EOP (End-Of-Page) mark consisting of two concatenated EOLs. Padding bits are added immediately after the end of the compressed data.

CCITT Group 4 2D File Format:

Data Line 1 | Data Line 2 | ... | Data Line n-1 | Data Line n | EOL | EOL | PAD bits

Advantages:
Better resolution.

Disadvantages:
Slow.
Complex.
As there is no periodically 1D-coded reference line, a single error can result in the rest of the page being skewed.

Comparison of Group 3 2D with Group 4 2D:

Group 3 2D | Group 4 2D
Entire image is divided into equal-sized groups of k lines | Each image is treated as a single group
First line of each group is coded using Group 3 1D | First line of the image is coded w.r.t. an imaginary line of white pixels
Remaining lines of each group are coded w.r.t. the first line | The i-th line is coded w.r.t. the (i-1)-th line
Fast | Slow
Simple | Complex
Error handling capability available | No error handling mechanism present
Lesser compression | Maximum compression, nearly twice that of Group 3 2D
Makes use of EOL | Makes use of EOP

The CCITT Group 3 and 4 formats are not suitable for:
The color component of an image
High-resolution graphics
JPEG was invented as a standard for color images, i.e. continuous-tone images.

Color images:

Color characteristics:
Luminance:
This is the measure of the light emitted or reflected by an object.

Hue:
This is the color sensation produced in an observer due to the presence of certain wavelengths of color.

Saturation:
Depth of a color, e.g. the difference between red and pink.

What is a Color model ?

A color model is an orderly system for creating a whole range of colors from a small set of primary colors. There are two types of color models:
Subtractive
Additive

Additive color models use light to display color, while subtractive models use printing inks.
Colors perceived in additive models are the result of transmitted light; this is the typical technique on color displays.
Colors perceived in subtractive models are the result of reflected light; this is the typical technique in printers and plotters.

Color models:

CMYK model:
The Cyan, Magenta, Yellow and Black (CMYK) model is used in color printing devices. It is a subtractive color model.

HSI model (HSB model):

The Hue, Saturation and Intensity model represents tint, shade and tone. This model is used in image processing for filtering and smoothing images. It requires a high level of computation.

YUV model:

Developed by NTSC. Subtractive model.
Y: luminance component; carries the black and white representation information.
UV: chrominance components; contain the color information (U = red-cyan, V = magenta-green).
Used in full-motion video.

RGB model:
This model is additive in nature: intensities of red, green and blue are added to generate various colors. It is used in the design of image capture devices, televisions, and color monitors.

No color model is better than the other, the choice depends on the application.

RGB Color Model | CMYK Color Model
Additive color model | Subtractive color model
For computer displays | For printed material
Uses light to display color | Uses ink to display color
Colors result from transmitted light | Colors result from reflected light
Red + Green + Blue = White | Cyan + Magenta + Yellow = Black

Data and file format standards:

Data and file format standardization is crucial for sharing of data among multiple applications and for exchanging information among multiple computers.

Sharing is possible if the two vendors agree on the same format. With the rapid development of the PC industry, formats moved from text-based files to multifunction formats.

Images obtained are stored on the computer using different formats. A file format is a structure which defines how information is stored in a file and how that information is displayed on the monitor. There are various file formats that can be used universally, i.e. that can be understood by different operating systems. Some of the image formats that can be used on Macintosh or Windows are:
BMP (Bit Mapped Graphic Image)
TIFF
GIF
JPEG

The multimedia file formats to be discussed:
Rich-Text Format (RTF)
Tagged Image File Format (TIFF)
Resource Interchange File Format (RIFF)
Musical Instrument Digital Interface (MIDI)
Joint Photographic Experts Group (JPEG)
Audio Video Interleaved (AVI)
TWAIN

Rich-Text Format(RTF)
Was designed to create a standard format for text with presentation information (color, font, etc.).
Not a multimedia format, but it is important for MMS, as most messaging systems use text fields for embedding or linking multimedia objects.
RTF carries the following information:
Character set
Character formatting
Font table
Color table
Section formatting (page break)
Paragraph formatting
Special characters

TIFF stands for "Tagged Image File Format." It is a graphics file format created in the 1980s to be the standard image format across multiple computer platforms. The Tagged Image File Format, which is known for storing multiple images in one document, was originally created by Aldus. Adobe Systems, which acquired Aldus, now holds the copyright to the TIFF specification. Since the original TIFF standard was introduced, people have been making many small improvements to the format, so there are now around 50 variations of the TIFF format. Recently, JPEG has become the most popular universal format, because of its small file size and Internet compatibility.
File extensions: .TIF, .TIFF
This format is widely supported by scanning, faxing and other image manipulation applications.

Tagged Image File Format (TIFF) files are used for a diverse set of applications such as GIS (geographic information systems), CAD drawing programs, graphic arts, and so forth.

In the tagged file formats, tags are used to keep all the attribute information in a standard manner. The TIFF file format provides tags that store information about resolution, color, the compression scheme used for capturing, date and time of capture, and even the operator who created the file. The search through the file is quick, as the tag locations are found through pointers. A tagged file allows new information to be added at any time.

[Figure: TIFF file layout showing tags (Tag 0, 1, 2) and image data (Img 2).]

Here, the image is divided into units called tags. Information about where each tag is physically stored is kept in the Image File Directory table. When we modify any tag or image, we rewrite those tags at available blank space.

RIFF (Resource Interchange File Format)
The Resource Interchange File Format (RIFF) is a generic file container framework for storing data in tagged chunks (Wikipedia). It provides a framework for multimedia file formats for Microsoft Windows based applications. It is not a file format but a framework, which is filled with user-given data. RIFF consists of blocks of data called chunks, similar to the image file directory entries of TIFF.

Each RIFF chunk contains:
A four-character string ID called a tag
4 bytes containing the size of the data
Form type
Actual data

Three types of chunks:

RIFF chunk
Sub chunk
List chunk

Each sub chunk contains:
A four-character string ID called a tag
4 bytes containing the size of the data
Actual data
A sub chunk allows more information to be added when the primary chunk is not sufficient.

A list chunk allows more information to be added, like:
Copyright information
Creation date

RIFF can store BMP, JPEG, and other image formats, and also the AVI and WAV formats.
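The chunk layout (tag, size, form type, data) can be sketched with Python's `struct` module. This is a minimal illustration: it relies on the RIFF conventions of little-endian 4-byte sizes and padding of odd-sized chunks to an even length, and the form type `DEMO` and the tags `note` and `data` are hypothetical, chosen only to build a tiny in-memory example.

```python
import struct


def build_chunk(tag, data):
    """Pack one chunk: 4-character tag, 4-byte little-endian size, data, pad byte if odd."""
    padding = b"\x00" if len(data) % 2 else b""
    return tag + struct.pack("<I", len(data)) + data + padding


def parse_riff(blob):
    """Return the form type and a list of (tag, data) sub chunks of a RIFF chunk."""
    tag, size, form_type = struct.unpack_from("<4sI4s", blob, 0)
    if tag != b"RIFF":
        raise ValueError("not a RIFF chunk")
    chunks = []
    offset, end = 12, 8 + size
    while offset < end:
        sub_tag, sub_size = struct.unpack_from("<4sI", blob, offset)
        chunks.append((sub_tag, blob[offset + 8 : offset + 8 + sub_size]))
        offset += 8 + sub_size + (sub_size % 2)   # skip the pad byte of odd-sized chunks
    return form_type, chunks


# Hypothetical container: form type "DEMO" holding two sub chunks.
body = b"DEMO" + build_chunk(b"note", b"hello") + build_chunk(b"data", b"\x01\x02")
blob = build_chunk(b"RIFF", body)

form_type, chunks = parse_riff(blob)
print(form_type)  # b'DEMO'
print(chunks)     # [(b'note', b'hello'), (b'data', b'\x01\x02')]
```

Since the size field is 4 bytes wide, a single chunk can describe up to about 4 GB of data.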




TIFF | RIFF
True file format | Not a true file format, but a framework
Can store only images | Can store images, audio, video
Can store bitmap images only | Can store BMP, JPEG, and other image formats
Information is divided in tags | Information is divided in chunks
A tag is comparatively small in size | A chunk can be as big as 4 GB

MIDI File Format:

MIDI (Musical Instrument Digital Interface) is an industry-standard protocol that enables electronic musical instruments (synthesizers, drum machines), computers and other electronic equipment (MIDI controllers, sound cards, samplers) to communicate and synchronize with each other. Computers that have a MIDI interface can record sounds created by a synthesizer and then manipulate the data to produce new sounds.

Professional musicians use sound recording systems that record a song with multiple compositions of voice and music. Each composition of the song is mastered on a separate track. After all the tracks are completely recorded and edited, they are superimposed to play simultaneously for cutting the CD. The MIDI file format follows the same strategy to store separate tracks of music for each instrument so that they can be read and synchronized when they are played.

Like RIFF, the MIDI format contains chunks (blocks) of data. Each chunk has:
ID
Size
Data

Two types:
Header chunk
Track chunk

A track chunk contains:
ID
Size (track length)
Data
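The ID / size / data layout of MIDI chunks can be sketched in the same way. The details below come from the Standard MIDI File format rather than from the text above: the chunk IDs `MThd` (header) and `MTrk` (track), big-endian 4-byte sizes, and a 6-byte header holding format, number of tracks, and time division. The file content itself is a made-up minimal example.

```python
import struct


def parse_midi_chunks(blob):
    """Split a MIDI file image into a list of (chunk ID, chunk data) pairs."""
    chunks = []
    offset = 0
    while offset < len(blob):
        chunk_id, size = struct.unpack_from(">4sI", blob, offset)
        chunks.append((chunk_id, blob[offset + 8 : offset + 8 + size]))
        offset += 8 + size
    return chunks


# Hypothetical minimal file: format 0, one track, 96 ticks per quarter note.
header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
track_data = b"\x00\xff\x2f\x00"   # delta time 0, end-of-track meta event
track = b"MTrk" + struct.pack(">I", len(track_data)) + track_data

chunks = parse_midi_chunks(header + track)
midi_format, num_tracks, division = struct.unpack(">HHH", chunks[0][1])

print([chunk_id for chunk_id, _ in chunks])  # [b'MThd', b'MTrk']
print(midi_format, num_tracks, division)     # 0 1 96
```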