
LOSSY AND LOSSLESS COMPRESSION TECHNIQUES

Data compression is a function of the presentation layer in the OSI reference model.


Compression is often used to maximize the use of bandwidth across a network or to optimize
disk space when saving data.
There are two general types of compression algorithms: 
1. Lossless compression
2. Lossy compression

                         

Lossless Compression
Lossless compression compresses the data in such a way that when the data is
decompressed it is exactly the same as it was before compression, i.e. there is no loss of data.
Lossless compression is used to compress file data such as executable code, text
files, and numeric data, because programs that process such file data cannot tolerate mistakes
in the data.
Lossless compression will typically not compress a file as much as lossy compression
techniques do and may take more processing power to accomplish the compression.

Lossless Compression Algorithms


The various algorithms used to implement lossless data compression are:
1. Run length encoding
2. Differential pulse code modulation
3. Dictionary based encoding
1. Run length encoding
• This method replaces consecutive occurrences of a given symbol with only one copy of
the symbol along with a count of how many times that symbol occurs. Hence the name ‘run
length'.
• For example, the string AAABBCDDDD would be encoded as 3A2B1C4D.
• A real-life example where run-length encoding is quite effective is the fax machine. Most
faxes are white sheets with occasional black text. So, a run-length encoding scheme can
take each line and transmit a code for white, then the number of pixels, then the code for black
and the number of pixels, and so on.
• This method of compression must be used carefully. If there is not a lot of repetition in the
data then it is possible the run length encoding scheme would actually increase the size of a
file.
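
A minimal Python sketch of the run-length encoding described above (the function name
rle_encode is illustrative, not from any standard library):

def rle_encode(data: str) -> str:
    """Run-length encode a string: AAABBCDDDD -> 3A2B1C4D."""
    if not data:
        return ""
    out = []
    run_char, run_len = data[0], 1
    for ch in data[1:]:
        if ch == run_char:
            run_len += 1
        else:
            out.append(f"{run_len}{run_char}")
            run_char, run_len = ch, 1
    out.append(f"{run_len}{run_char}")
    return "".join(out)

print(rle_encode("AAABBCDDDD"))  # 3A2B1C4D
print(rle_encode("ABCDE"))       # 1A1B1C1D1E - longer than the input, as warned above
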
2. Differential pulse code modulation
• In this method, first a reference symbol is placed. Then, for each symbol in the data, we place
the difference between that symbol and the reference symbol.
• For example, using symbol A as the reference symbol, the string AAABBCDDDD would be
encoded as A0001123333, since A is the same as the reference symbol, B has a difference of 1
from the reference symbol, and so on.
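
A small Python sketch of this differential scheme, using the first symbol as the reference
(the helper name dpcm_encode is illustrative only):

def dpcm_encode(data: str) -> str:
    """Encode each symbol as its difference from a reference symbol
    (here the first symbol): AAABBCDDDD -> A0001123333."""
    if not data:
        return ""
    ref = data[0]
    diffs = "".join(str(ord(ch) - ord(ref)) for ch in data)
    return ref + diffs

print(dpcm_encode("AAABBCDDDD"))  # A0001123333
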
3. Dictionary based encoding
• One of the best known dictionary based encoding algorithms is Lempel-Ziv (LZ)
compression algorithm.
• This method is also known as substitution coder.
• In this method, a dictionary (table) of variable length strings (common phrases) is built.
• This dictionary contains almost every string that is expected to occur in data.
• When any of these strings occur in the data, then they are replaced with the corresponding
index to the dictionary.
• In this method, instead of working with individual characters in text data, we treat each
word as a string and output the index in the dictionary for that word.
• For example, let us say that the word "compression" has the index 4978 in one particular
dictionary; it is the 4978th word in /usr/share/dict/words. To compress a body of text, each
time the string "compression" appears, it would be replaced by 4978.

Lossy Compression
Lossy compression is the one that does not promise that the data received is exactly
the same as the data sent, i.e. some data may be lost.
This is because a lossy algorithm removes information that it cannot later restore.
Lossy algorithms are used to compress still images, video and audio.
Lossy algorithms typically achieve much better compression ratios than the lossless
algorithms.
Audio Compression
• Audio compression is used for speech or music.
• For speech, we need to compress a 64-kbps digitized signal; for music, we need to
compress a 1.411-Mbps signal.
• Two types of techniques are used for audio compression:
1. Predictive encoding
2. Perceptual encoding
                        
Predictive encoding
• In predictive encoding, the differences between the samples are encoded instead of
encoding all the sampled values.
• This type of compression is normally used for speech.
• Several standards have been defined such as GSM (13 kbps), G.729 (8 kbps), and G.723.1
(6.4 or 5.3 kbps).
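
A tiny Python illustration of the idea, with made-up sample values; the differences between
neighbouring samples are much smaller than the samples themselves, so they can be stored in
fewer bits:

samples = [100, 102, 101, 105, 110, 108]  # invented sample values
diffs = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
print(diffs)  # [100, 2, -1, 4, 5, -2]
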
Perceptual encoding
• The perceptual encoding scheme is used to compress CD-quality audio, which uncompressed
requires a transmission bandwidth of 1.411 Mbps.
• MP3 (MPEG audio layer 3), a part of the MPEG standard, uses this perceptual encoding.
• Perceptual encoding is based on the science of psychoacoustics, a study of how people
perceive sound.
• The perceptual encoding exploits certain flaws in the human auditory system to encode a
signal in such a way that it sounds the same to a human listener, even if it looks quite
different on an oscilloscope.
• The key property of perceptual coding is that some sounds can mask other sounds. For
example, imagine that you are broadcasting a live flute concert and all of a sudden someone
starts striking a hammer on a metal sheet. You will not be able to hear the flute any more. Its
sound has been masked by the hammer.
• The technique explained above is called frequency masking: the ability of a loud sound in
one frequency band to hide a softer sound in another frequency band that would have been
audible in the absence of the loud sound.
• Masking can also occur on the basis of time. For example, even after the hammer stops
striking the metal sheet, the flute will be inaudible for a short period of time, because the ear
turns down its gain when a loud sound starts and takes a finite time to turn it back up.
• Thus, a loud sound can numb our ears for a short time even after the sound has stopped.
This effect is called temporal masking.

MP3
• MP3 uses these two phenomena, i.e. frequency masking and temporal masking, to compress
audio signals.
• In such a system, the technique analyzes and divides the spectrum into several groups. Zero
bits are allocated to the frequency ranges that are totally masked.
• A small number of bits are allocated to the frequency ranges that are partially masked.
• A larger number of bits are allocated to the frequency ranges that are not masked (a toy
version of this allocation rule is sketched after this list).
• Based on the range of frequencies in the original analog audio, MP3 produces three data
rates: 96 kbps, 128 kbps and 160 kbps.
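
The allocation rule above can be illustrated with a toy Python sketch; this is not the real MP3
psychoacoustic model, and the band names, signal levels, masking thresholds and bit counts
below are invented numbers:

bands = [
    {"band": "0-1 kHz", "signal_db": 60, "mask_db": 20},   # well above the mask
    {"band": "1-2 kHz", "signal_db": 35, "mask_db": 30},   # partially masked
    {"band": "2-4 kHz", "signal_db": 10, "mask_db": 40},   # totally masked
]

def allocate_bits(band):
    margin = band["signal_db"] - band["mask_db"]
    if margin <= 0:
        return 0      # totally masked: zero bits
    if margin < 10:
        return 4      # partially masked: a small number of bits
    return 12         # not masked: a larger number of bits

for b in bands:
    print(b["band"], allocate_bits(b), "bits")
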
Multimedia Systems / Multimedia Input & Output Technologies
Hardware for Multimedia

 1 Input and Output Devices


o 1.1 Key devices for multimedia output
o 1.2 Monitors
o 1.3 Speakers and midi interfaces
o 1.4 Alphanumeric keyboards and optical character recognition
o 1.5 Digital cameras and scanners
o 1.6 Video Camera and Frame Grabbers
o 1.7 Microphones and MIDI keyboards
o 1.8 Mice, Trackballs, Joy sticks, Drawing tablets
o 1.9 CD-ROMs and Video Disks

Input and Output Devices


Key devices for multimedia output

 Monitors for text and graphics (still and motion)


 Speakers and midi interfaces for sound
 Specialized helmets and immersive displays for virtual reality
 Key devices for multimedia input
 Keyboard and ocr for text
 Digital cameras, scanners, and cd-roms for graphics
 midi keyboards, cd-roms and microphones for sound
 Video cameras, cd-roms, and frame grabbers for video
 Mice, trackballs, joy sticks, virtual reality gloves and wands, for spatial data
 Modems and network interfaces for network data

Monitors

 Most important output device


 Provides all the visual output to the user
 Should be designed for the highest quality image, with least distortion
 Large vacuum tube with electron gun at one end aimed at a large surface
(viewing screen) on the other end
 Viewing screen is coated with chemicals that glow with different colors; three
different phosphors are used for color screens
 Source of electron beam is electrically negative pole or cathode (hence the
name Cathode Ray Tube, or CRT)
 Two different sets of colors used in monitors: RGB and CMY, with either set
capable of the full color spectrum
 Electron beam strikes the screen many times per second
 Phosphors are re-excited at each electron strike for a brief instant
 Refresh rate, measured in Hz
 Preferred refresh rate is 75 Hz or more
 Electron beam sweeps across the screen in a regular pattern
 Required to refresh phosphors frequently and equally
 Raster scan pattern
 Beam is on when going from left to right (trace), and turned off when going
from right to left (retrace)
 Three separate electron beams for three colors, for better focus and higher
refresh rates
 Screen divided into individual picture elements, or pixels
 Each pixel is made of its own phosphor elements to give the color
 Memory chip contains a map of what colors to display on each pixel
 Bit map
 Mostly used in context of binary images (black or white)

 One bit per pixel to indicate whether pixel is black or white


 Color maps, or pixmap
 One byte for each color for every pixel (24-bit color)
 Image changed in the memory map associated with screen
 For realistic motion images and for a flicker-free screen, the bit map must
be modified faster than the eye can perceive (30 frames/sec)
 For a 640 × 480 screen, the number of bits is: 640 × 480 × 24 = 7,372,800
 To refresh the screen 30 times per second, the number of bits transferred in a
second is: 640 × 480 × 24 × 30 = 221,184,000, or about 221 Mb (this arithmetic is
worked through at the end of this Monitors section)
 A larger screen requires more data to be transferred
 The transfer rate limitation can be overcome by using a hardware accelerator
board to perform certain graphic display functions in hardware
 Full-screen 30 frames per second performance may not be possible
even with a graphics accelerator board
 Physical size of monitor
 Important factor in the quality of multimedia presentation
 Typically between 11 and 20 inches on diagonal
 Another important factor is the number of pixels per inch
 Too few pixels make the image look grainy
 For best quality images, pixels should not be wider than 0.01 inches (0.28 mm) in diameter
 The latter quantity is used for marketing the monitors (0.25 mm dot pitch)


 Graphics display board
 Used in addition to monitor to speed up graphics
 Special hardware circuits for 2D and 3D graphics
 Simple graphics boards just translate image data from ram into a form
usable by the monitor
 Complex boards can even speed up the refresh rate of screen
 Qualities of a good multimedia monitor
 Size, refresh rate, dot pitch
 Other concerns about monitor include weight and ambient light
 Liquid crystal display monitors
 Flat screen displays
 Crystals allow more or less light to pass through them, depending upon
the strength of an electric field
 Not appropriate for multimedia presentation as the view angle is
extremely important
 3D monitors in the future
 Human factor concerns
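
A worked version of the screen-refresh arithmetic from the Monitors list above
(640 × 480 pixels, 24-bit color, 30 frames per second), in Python:

width, height = 640, 480
bits_per_pixel = 24
frames_per_second = 30

bits_per_frame = width * height * bits_per_pixel
bits_per_second = bits_per_frame * frames_per_second

print(f"{bits_per_frame:,} bits per frame")    # 7,372,800
print(f"{bits_per_second:,} bits per second")  # 221,184,000 (about 221 Mb/s)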

Speakers and midi interfaces

 Production of sound
1. Digitized representation of frequency and sound transmitted at the appropriate time to the
loudspeaker (.WAV files); this is the common method
2. Commands for sound synthesis transmitted to a synthesizer at the appropriate time
(midi files); used for the generation of music
 Musical Instrument Digital Interface (midi)
 Standard interface, covering both hardware and control logic, between computers and
music synthesizers



 Adopted in 1982



 Consists of two parts
1. Hardware standard: specifies cables, circuits, connectors, and electrical signals to be used
2. Message standard: types and formats of messages to be transmitted to/from
synthesizers, control units (keyboards), and computers. Messages consist of a device
number, a control segment to tell the device the function to be performed (turn on/off a
specified circuit), and a data segment to provide the information necessary for the action
(volume of sound, or frequency of basic sound); a byte-level sketch of one such message
follows this list

 An entire piece of music can be described by a sequence of midi messages
 midi interface
 Required in the computer to communicate with midi instruments
 Circuit board to translate the signals
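
The device/control/data message structure described above corresponds to real midi
messages. A minimal Python sketch of one such message, the three-byte Note On (the channel,
note and velocity values here are just an example):

def note_on(channel, note, velocity):
    """Build a 3-byte midi Note On message."""
    return bytes([0x90 | (channel & 0x0F),   # command (Note On) + channel number
                  note & 0x7F,               # which note to sound (60 = middle C)
                  velocity & 0x7F])          # how loudly to sound it

print(note_on(0, 60, 64).hex())  # 903c40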

Alphanumeric keyboards and optical character recognition

 Used for textual input


 Pressing a key on a keyboard closes a circuit corresponding to the key to send
a unique code to the cpu
 Printed text can be input using ocr software
 ocr software analyzes an image to translate symbols into character
codes
 Systematically checks the entire page, searching for patterns of dark
and light recognizable as alphabetic, numeric, or punctuation characters

 Chooses the best match from a set of known patterns (a toy version of this matching
is sketched below)


 The quality of the scanned page limits the quality of the OCR output
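
A toy Python sketch of the best-match idea, with tiny 3 × 3 black-and-white glyphs standing
in for the scanned patterns (real OCR software uses far richer features):

KNOWN = {              # each string is a 3x3 grid of pixels, row by row
    "I": "010010010",
    "T": "111010010",
    "L": "100100111",
}

def recognize(scanned):
    """Return the known character whose pattern differs from the
    scanned pixels in the fewest positions."""
    def distance(pattern):
        return sum(a != b for a, b in zip(scanned, pattern))
    return min(KNOWN, key=lambda ch: distance(KNOWN[ch]))

print(recognize("010110010"))  # I (one noisy pixel away from the stored 'I')
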
Digital cameras and scanners
Real image and Digital image (Representation of real image in terms of pixels)

 Still image
 Snapshot of an instance
 Motion image
 Sequence of images giving the impression of continuous motion
 Graininess in real images (individual dots observed when a photograph taken by a
conventional camera is enlarged)
Digital image capture

 Light is focused on photosensitive cells to produce electric current in response to


intensity and wavelength of light
 Electric current is scanned for each point on the image and translated to binary codes
 Codes correspond to pixel values and can be used to rebuild the original picture
Scanners scan an image from one end to the other

 Scanning mechanism shines bright light on the image and codes and records the
reflected light for each point
 Scanner does not store the data but sends it to the computer, possibly after
compressing it
Quality of images

 Depends on the quality of optics and sharpness of focus


 Perceived by sharpness of resulting image
 Accuracy of encoding for each pixel depends on the precision of photosensitive cells
 Resolution of scanner/camera (number of dots/inch)
 Amount of storage available

 Preferable to scan at the highest possible resolution under given hardware and storage
space constraints to get the most detail in the original image

Video Camera and Frame Grabbers


A standard video camera contains photosensitive cells and scans one frame after another.
The output of the cells is recorded as an analog stream of colors, or sent to digitizing
circuitry to generate a stream of digital codes

 Video input card


 Required for use of video camera to input video stream into computer
 Digitizes the analog signal from camera
 Output can be sent to a file for storage, cpu for processing, or monitor for
display (or all of them)
 Frame grabber
 Allows the capture of a single frame of data from video stream
 Not as good resolution as a still camera
 Typical frame grabbers process 30 frames per second for real time
performance

Microphones and MIDI keyboards


These are used to input original sounds (analog)

 Microphone has a diaphragm that vibrates in response to sound waves


 Vibrations modulate a continuous electric current analogous to sound waves
 Modulated current can be digitized and stored in a standardized format for audio data,
such as a .WAV file
 Microphone plugs into a sound input board
 Developer can control the sampling rate for digitizing
 Higher sampling rate gives better fidelity but requires more space (a worked
storage calculation follows this list)
 Sampling rate for music: 20,000 Hz
 Sampling rate for speech: 10,000 Hz
 Digital audio files can be edited (cut and paste) using audio software such as
Cool Edit and Audacity
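
A short Python calculation of the storage cost of the two sampling rates above, assuming
16-bit mono samples (the bit depth is an assumption; the notes give only the sampling rates):

def bytes_per_minute(sampling_rate_hz, bits_per_sample=16):
    return sampling_rate_hz * bits_per_sample // 8 * 60

print(f"music  (20,000 Hz): {bytes_per_minute(20000):,} bytes per minute")  # 2,400,000
print(f"speech (10,000 Hz): {bytes_per_minute(10000):,} bytes per minute")  # 1,200,000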

Mice, Trackballs, Joy sticks, Drawing tablets


These are used to enter positional information as 2D or 3D data from a standard reference
point

 Latitude, longitude, altitude


 Commonly used to define a point on the computer screen
 Mouse defines the movement in terms of two numbers, left/right and up/down on
the screen, with respect to one corner
 Movement of mouse is tracked by software, which can also set the tracking speed
 Trackball works the same way as the mouse
 A joystick is a trackball with a handle
 Pressing the button of mouse/trackball/joystick sends a signal to the computer asking
it to perform some function
 Multimedia software should be able to determine the positional information as well as
the signal context (mouse press)

CD-ROMs and Video Disks


These are popular media for storage and transport of data. Data is written on the disk by
burning tiny holes, which are interpreted as binary 0s and 1s by software. These days USB
flash drives are also gaining popularity. Features:

 Read-only devices; data can be written only once


 CD-roms can typically store about 600MB of information
 With time, the speed has improved (4X in 1995 to more than 50X now)
 DVD-roms allow a few gigabytes of data on a single disk
 Ideal media for distributing multimedia productions (low cost)
Audio
AAC
Advanced Audio Coding (similar to MP3) is a digital audio format designed for high
compression as well as high audio quality.

Like MP3 files, Advanced Audio Coding (AAC) files are lossy. AAC files are generally
similar in size to MP3s while being somewhat higher in quality.

They can also be created with a variable bit rate or constant bit rate. AAC is an open
standard, and, unlike MP3 historically, no royalties are required to distribute content in the
format.

.AAC files are most commonly associated with iTunes, though they can be used on other
player devices and gaming consoles.

AVI
Audio Video Interleave is a Windows movie file format with high video quality, but a large
file size. Approximately 25 GB is required for 60 minutes of video.

MP3
MPEG 1 Audio Layer 3 is a digital audio format that is designed for high compression of
audio files while maintaining high audio quality.

.MP3 files are the most common audio file format around. MP3s use lossy compression, which
means their quality degrades each time the file is re-encoded. MP3s are still relatively large
compared with other lossy formats on this list.

MP3 files can be encoded at a constant bit rate or variable bit rate. A constant bit rate ensures
the same quality throughout the audio file, but results in a larger file size. A variable bit rate
uses fewer bits during silent or near-silent moments of a file, resulting in a smaller
overall file size. Most smartphones and music players support the MP3 format.

MP3 VBR
MP3 using variable bit rates, which provides better quality and smaller files.

Audible 2, 3 and 4
Audio file format (.aa file extension) used for audio books or other voice recordings. Entire
books can be stored in a single file.

Apple Lossless
Uses the .m4a file extension, the same as AAC. Creates a larger file than AAC, but retains
more information and quality.
AIFF
Audio Interchange File Format, similar to WAV. AIFF preserves the original sound quality
but produces large files.

WAV
Wave (WAV) provides the same sound quality as the original CD, with a correspondingly large
file size.
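
The standard CD parameters (44,100 samples per second, 16 bits per sample, 2 channels)
explain both the large WAV file size and the 1.411 Mbps figure quoted in the audio
compression notes above:

bits_per_second = 44100 * 16 * 2
bytes_per_minute = bits_per_second // 8 * 60
print(f"{bits_per_second:,} bits per second")    # 1,411,200 (about 1.411 Mbps)
print(f"{bytes_per_minute:,} bytes per minute")  # 10,584,000 (roughly 10 MB per minute)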

Video
H.264
This is a digital video codec noted for high data compression while maintaining high quality.

MPEG-2
A combination of audio and video compression for storage of movies.

MOV
QuickTime Movie Format

M4V
An MPEG-4 video file.

MP4
MPEG-4 is a versatile file format that can include audio, video, images and animations.

DAT
Digital Data Storage. Data file format that can be used for text, graphics or binary data.

VOB
Video Object is an MPEG-2 DVD video movie file.
Distributed DBMS Architectures
DDBMS architectures are generally developed depending on three
parameters −

 Distribution − It states the physical distribution of data across the different sites.

 Autonomy − It indicates the distribution of control of the database system and the
degree to which each constituent DBMS can operate independently.

 Heterogeneity − It refers to the uniformity or dissimilarity of the data models, system
components and databases.

Architectural Models
Some of the common architectural models are −

 Client - Server Architecture for DDBMS

 Peer - to - Peer Architecture for DDBMS

 Multi - DBMS Architecture

Client - Server Architecture for DDBMS


This is a two-level architecture where the functionality is divided into
servers and clients. The server functions primarily encompass data
management, query processing, optimization and transaction
management. Client functions mainly cover the user interface; however,
clients also have some functions like consistency checking and
transaction management (a minimal sketch follows the list below).

The two different client-server architectures are −

 Single Server Multiple Client

 Multiple Server Multiple Client
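
A minimal, single-process Python sketch of this division of labour (not a distributed system;
sqlite3 merely stands in for the server's data-management component, and the table and names
are invented):

import sqlite3

class Server:
    """Server side: data management, query processing, transaction management."""
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE emp (name TEXT, dept TEXT)")
        self.db.executemany("INSERT INTO emp VALUES (?, ?)",
                            [("Asha", "Sales"), ("Ravi", "HR")])

    def execute(self, sql):
        return self.db.execute(sql).fetchall()

class Client:
    """Client side: mainly the user interface; it submits queries and shows rows."""
    def __init__(self, server):
        self.server = server

    def show(self, sql):
        for row in self.server.execute(sql):
            print(row)

Client(Server()).show("SELECT name FROM emp WHERE dept = 'Sales'")  # ('Asha',)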


Peer- to-Peer Architecture for DDBMS
In these systems, each peer acts both as a client and a server for
providing database services. The peers share their resources with other
peers and coordinate their activities.

This architecture generally has four levels of schemas −

 Global Conceptual Schema − Depicts the global logical view of data.

 Local Conceptual Schema − Depicts logical data organization at each site.

 Local Internal Schema − Depicts physical data organization at each site.

 External Schema − Depicts user view of data.


Multi - DBMS Architectures
This is an integrated database system formed by a collection of two or
more autonomous database systems.

Multi-DBMS can be expressed through six levels of schemas −

 Multi-database View Level − Depicts multiple user views comprising subsets of the
integrated distributed database.

 Multi-database Conceptual Level − Depicts an integrated multi-database that comprises
global logical multi-database structure definitions.

 Multi-database Internal Level − Depicts the data distribution across different sites and
the multi-database to local data mapping.

 Local database View Level − Depicts the public view of local data.

 Local database Conceptual Level − Depicts local data organization at each site.

 Local database Internal Level − Depicts physical data organization at each site.

There are two design alternatives for multi-DBMS −

 Model with multi-database conceptual level.


 Model without multi-database conceptual level.
