
1.1 DATA REPRESENTATION AHMED THAKUR


1.1.5 COMPRESSION TECHNIQUES

 Show understanding of how digital data can be compressed, using either ‘lossless’ (including
run-length encoding – RLE) or ‘lossy’ techniques

Compression

Why compress files?


Processing power and storage space are valuable on a computer. To get the best out of both, we may
need to reduce the file size of text, image and audio data so that it can be transferred more quickly
and takes up less storage space.

In addition, large files take a lot longer to download or upload which leads to web pages, songs and
videos that take longer to load and play when using the internet.

Any kind of data can be compressed. There are two main types of compression: lossy and lossless.

Lossy compression
Lossy compression removes some of a file’s original data in order to reduce the file size. This might
mean reducing the number of colours in an image or reducing the number of samples in a sound
file. This can result in a small loss of quality of an image or sound file.

A popular lossy compression method for images is JPEG, which is why most images on the internet
are JPEG images. A popular lossy compression method for sounds is MP3. Once a file has been
compressed using lossy compression, the discarded data cannot be retrieved again.
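
The idea that discarded data cannot be retrieved can be sketched in a few lines of Python. This is only an illustration, assuming made-up sample values and the crudest possible lossy method (keeping every second sample). The names `downsample` and `rebuild` are hypothetical and not taken from any real audio format.

```python
def downsample(samples, factor):
    """Keep every `factor`-th sample and discard the rest (lossy)."""
    return samples[::factor]


def rebuild(kept, factor):
    """Best-effort reconstruction: repeat each kept sample `factor` times."""
    rebuilt = []
    for value in kept:
        rebuilt.extend([value] * factor)
    return rebuilt


original = [10, 12, 15, 11, 9, 8]   # hypothetical sample values
kept = downsample(original, 2)      # only half as many samples are stored
approximation = rebuild(kept, 2)    # same length as the original again

print(kept)                         # [10, 15, 9]
print(approximation)                # [10, 10, 15, 15, 9, 9]
print(approximation == original)    # False: the discarded samples are gone
```

The rebuilt list is close to the original but not identical, which is exactly the trade-off lossy compression makes.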

Lossless compression
Lossless compression doesn’t reduce the quality of the file at all. No data is lost, so lossless
compression allows a file to be recreated exactly as it was when originally created.

There are various algorithms for doing this, usually by looking for patterns in the data that are
repeated. Zip files are an example of lossless compression.

The space savings of lossless compression are not as good as they are with lossy compression.
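
A lossless round trip can be tried with Python's standard `zlib` module, which implements DEFLATE, the method commonly used inside Zip files. This is a minimal sketch, assuming a deliberately repetitive piece of made-up data; real files will usually compress less well.

```python
import zlib

# Hypothetical, highly repetitive data: repeated patterns compress well.
original = b"ABABABAB" * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print(restored == original)  # True: the round trip is exact, nothing is lost
```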

[Figure: Lossy vs. Lossless]
Data Compression
In digital signal processing, data compression, source coding, or bit-rate reduction involves
encoding information using fewer bits than the original representation. Compression can be either
lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical
redundancy. No information is lost in lossless compression. Lossy compression reduces bits by
identifying unnecessary information and removing it. The process of reducing the size of a data file
is referred to as data compression.

COMPUTER SCIENCE https://www.facebook.com/groups/OAComputers/


ahmed_thakur@hotmail.com, 0300-8268885
9608

Image Compression
Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes
and often for medical imaging, technical drawings, clip art, or comics. Lossy compression methods,
especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially
suitable for natural images such as photographs in applications where minor (sometimes
imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy
compression that produces imperceptible differences may be called visually lossless.

Audio Compression
Audio data compression, as distinguished from dynamic range compression, has the potential to
reduce the transmission bandwidth and storage requirements of audio data. Audio compression
algorithms are implemented in software as audio codecs. Lossy audio compression algorithms
provide higher compression at the cost of fidelity and are used in numerous audio applications.
These algorithms almost all rely on psychoacoustics to eliminate less audible or meaningful sounds,
thereby reducing the space required to store or transmit them.

In both lossy and lossless compression, information redundancy is reduced, using methods such as
coding, pattern recognition, and linear prediction to reduce the amount of information used to
represent the uncompressed data.

 Lossless audio compression produces a representation of digital data that decompresses to an
exact digital duplicate of the original audio stream, unlike playback from lossy compression
techniques such as Vorbis and MP3. Compression ratios are around 50–60% of original size, which
is similar to those for generic lossless data compression. Lossless compression is unable to attain
high compression ratios due to the complexity of waveforms and the rapid changes in sound
forms.

 Lossy audio compression is used in a wide range of applications. In addition to the direct
applications (MP3 players or computers), digitally compressed audio streams are used in most
video DVDs, digital television, streaming media on the internet, satellite and cable radio, and
increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater
compression than lossless compression (data of 5 percent to 20 percent of the original stream,
rather than 50 percent to 60 percent), by discarding less-critical data.

Audio Compression
As you can see, we have some serious issues with the size of sound files. Take a look at the size of a
3-minute pop song recorded at a sample rate of 44 kHz and a sample resolution of 16 bits.

44,000 * 16 * 180 = 126,720,000 bits (roughly 15 MB for a single channel)

As you are probably aware, an MP3 of the same length would be roughly 3 MB, a fifth of the size. So
what gives? It is easy to see that the raw file sizes for sounds are just too big to store and transmit
easily. What is needed is a way to compress them.
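
The size calculation above can be reproduced with a short Python sketch. The function name `raw_audio_bits` and its `channels` parameter are assumptions added for illustration; the calculation in the text corresponds to a single channel.

```python
def raw_audio_bits(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Size of uncompressed audio in bits: samples per second * bits * time."""
    return sample_rate_hz * bits_per_sample * seconds * channels


bits = raw_audio_bits(44_000, 16, 180)   # 3 minutes = 180 seconds
size_bytes = bits // 8                   # 8 bits per byte

print(bits)        # 126720000
print(size_bytes)  # 15840000
```

Passing `channels=2` doubles the result, which is why stereo recordings are larger still.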

Lossless
Lossless compression - compression that doesn't lose any accuracy; the file can be decompressed
into an identical copy of the original audio data

WAV files typically store uncompressed audio, so they will be the size that you have calculated
already. There are lossless compressed file formats such as FLAC, which compress the WAV
file into data that is generally around 50% of the original size. One technique used for this is
run-length encoding, which looks for values that are repeated in a row in the sound file and, instead
of recording each one separately, stores how many times the value occurs in a row. Let us take a
hypothetical set of sample points:

0000000000000000000001234543210000000000000000000123456787656789876

As you can see, the silent areas take up a large part of the file. Instead of recording these samples
individually, we can store how many silent samples there are in a row, massively reducing the file size:

(21-0)123454321(17-0)123456787656789876
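
The run-length encoding shown above can be sketched in Python as follows. This is an illustrative sketch, not the way any real audio format stores its data. The `(count-value)` notation is copied from the example, while the function names and the `min_run` threshold (runs shorter than this are left as they are) are assumptions. The input is built from the run lengths given in the encoded form.

```python
import re
from itertools import groupby


def rle_encode(samples, min_run=3):
    """Replace each run of at least `min_run` equal digits with (count-value)."""
    parts = []
    for value, group in groupby(samples):
        run_length = len(list(group))
        if run_length >= min_run:
            parts.append(f"({run_length}-{value})")
        else:
            parts.append(value * run_length)  # short runs are stored unchanged
    return "".join(parts)


def rle_decode(encoded):
    """Expand every (count-value) marker back into the repeated digit."""
    return re.sub(r"\((\d+)-(.)\)", lambda m: m.group(2) * int(m.group(1)), encoded)


# 21 silent samples, a burst of sound, 17 silent samples, another burst
samples = "0" * 21 + "123454321" + "0" * 17 + "123456787656789876"
encoded = rle_encode(samples)

print(encoded)                         # (21-0)123454321(17-0)123456787656789876
print(len(samples), "->", len(encoded))
print(rle_decode(encoded) == samples)  # True: run-length encoding is lossless
```

Decoding gives back exactly the original samples, which is what makes this a lossless technique.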

Another technique used by FLAC files is linear prediction.
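
The idea behind linear prediction can be sketched with the simplest possible predictor: assume each sample will equal the one before it, and store only the difference. This is a simplified illustration with made-up sample values and hypothetical function names; the predictors FLAC really uses are more elaborate.

```python
def to_residuals(samples):
    """Store the difference between each sample and the previous one."""
    residuals = []
    previous = 0
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def from_residuals(residuals):
    """Rebuild the samples by adding the differences back up."""
    samples = []
    previous = 0
    for difference in residuals:
        previous += difference
        samples.append(previous)
    return samples


samples = [100, 102, 104, 107, 109]   # hypothetical, slowly changing samples
residuals = to_residuals(samples)

print(residuals)                             # [100, 2, 2, 3, 2]
print(from_residuals(residuals) == samples)  # True: nothing is lost
```

After the first value the differences are small numbers, which need fewer bits to store than the original samples.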

Lossy
FLAC files are still very large. What is needed is a format that allows you to create much smaller
files that can be easily stored on your computer and portable music device, and easily transmitted
across the internet.

Lossy compression - compression that loses some accuracy; the files are generally smaller than with
lossless compression

As we have already seen, to make smaller audio files we can decrease the sampling rate and the
sampling resolution, but we have also seen the dreadful effect this can have on the final sound.
There are other clever methods of compressing sounds. These methods won't let us get back the exact
audio that we started with, but the result will be close. This is lossy compression.

There are many lossy compressed audio formats, including MP3, AAC and OGG (which is open
source). The compression works by reducing the accuracy of certain parts of the sound that are considered
to be beyond the auditory resolution ability of most people. This method is commonly referred to
as perceptual coding. It uses psychoacoustic models to discard or reduce the precision of components less
audible to human hearing, and then records the remaining information in an efficient manner. Because
the accuracy of certain frequencies is lost, you can often tell the difference between the original and the
lossy versions by hearing the loss of high- and low-pitched tones.

Exercise: Sound compression

Question: Why is it necessary to compress sound files?


Answer: So that they take up less space and can be sent quickly across the internet or stored on
portable music players

Question: Name the two categories of compression available and give a file format for each


Answer: Lossy (MP3/AAC/OGG) and lossless (FLAC)

Question: Perform run-length encoding on the following sound file


012344444444444432222222222222211111111111111000000000000
Answer: 0123(11-4)3(13-2)(14-1)(11-0)

Question: Describe a technique used to compress MP3 files

Answer: Perceptual coding reduces the precision of frequencies stored in a sound file that are
beyond the auditory resolution of most people

Question: When would it be best to use FLAC instead of OGG, and vice versa?
Answer: Use FLAC when you really care about the sound quality and you're not bothered about the
file size. Use OGG when you are trying to make a sound file as small as possible.

Nyquist Theorem
We have seen the various ways that you can reduce the size of files. We have also seen that humans
have a limit to the frequencies that they can perceive. So what sampling rate would be needed to
store only the samples that humans can perceive? The full range of human hearing is between 20 Hz
and 20 kHz.

Extension: Human hearing limit


Different people are able to hear different ranges of frequencies. Up to what frequency can you hear?

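' Play test tones from 1 kHz to 25 kHz, each lasting half a second.
' Console.Beep only accepts 37 to 32767 Hz, so the loop starts at 1 rather than 0.
For x = 1 To 25
    Console.WriteLine("Can you hear: " & x * 1000 & " Hz?")
    Console.Beep(x * 1000, 500) ' frequency in Hz, duration in milliseconds
Next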

You lose your hearing with age, so the older you are the less likely you are to be able to hear the
full spectrum.

So why not just use 20 kHz as our sampling rate, record 20,000 samples per second and be done with it?
There is a small problem:

Cycle - A complete oscillation (up and down) in a sound wave

Period - The time that a wave takes to oscillate one cycle.

Frequency - The number of waves passing a point per second

Run-Time Encoding
The run-time requirements for encoding depend substantially on the intended use. To illustrate
this, it is helpful to compare the encoder requirements for broadcasting and for DVD-Video
production.

Example: Broadcasting
Within a production chain for a live transmission, encoding must not exceed a specific
delay time. To avoid synchronization problems, this applies to both audio and video, because
both signals must be combined before broadcasting. Procedures that meet this requirement
are called real-time or synchronous processes.

Example: DVD-Video
The encoding time is not important for DVD production; encoding may take much longer than decoding.
Only the competing concerns of quality and cost have to be considered. As the time
conditions for encoding and decoding are completely independent, procedures in this category
are called asynchronous processes. The DVD production example covers both time conditions and
the complexity of the algorithm (system performance).


To represent a sound wave properly, we need to sample it at least twice per cycle.

Therefore the minimum sampling rate that covers the range of human hearing is 40 kHz (2 * 20 kHz).
The 44.1 kHz sampling rate used for Compact Disc was chosen for this and other technical reasons.

Nyquist's theorem - the sample rate should be at a frequency which is at least twice the value of
the highest frequency in the sampled signal
https://en.wikibooks.org/wiki/A-level_Computing/AQA/Problem_Solving,_Programming,_Data_Representation_and_Practical_Exercise/Fundamentals_of_Data_Representation/Nyquist-theorem
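
A short Python sketch can show both the rule and what goes wrong when it is broken. The function names are hypothetical and the tones are idealised cosine waves. A 30 kHz tone is sampled at 40 kHz, which is too slow for it, and the samples turn out to be the same as those of a 10 kHz tone, so the two cannot be told apart afterwards.

```python
import math


def minimum_sample_rate(highest_frequency_hz):
    """Nyquist's theorem: sample at least twice the highest frequency."""
    return 2 * highest_frequency_hz


def sample_tone(frequency_hz, sample_rate_hz, count):
    """Take `count` samples of a cosine wave of the given frequency."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


print(minimum_sample_rate(20_000))   # 40000

too_high = sample_tone(30_000, 40_000, 8)   # needs 60 kHz, only given 40 kHz
lower = sample_tone(10_000, 40_000, 8)      # comfortably within the limit

same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(too_high, lower))
print(same)   # True: the two tones give identical samples
```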


Video Compression
Video compression uses modern coding techniques to reduce redundancy in video data. Most
video compression algorithms and codecs combine spatial image compression and temporal
motion compensation. Video compression is a practical implementation of source coding in
information theory. In practice, most video codecs also use audio compression techniques in parallel
to compress the separate, but combined data streams as one package.

The majority of video compression algorithms use lossy compression. Uncompressed video requires a
very high data rate. Although lossless video compression codecs achieve an average compression
factor of over 3, a typical MPEG-4 lossy compression video has a compression factor between 20
and 200.[24] As in all lossy compression, there is a trade-off between video quality, the cost of processing
the compression and decompression, and system requirements. Highly compressed video may
present visible or distracting artifacts.
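
To get a feel for these figures, the data rate of uncompressed video can be estimated with a short Python sketch. The frame size, colour depth and frame rate used here (1920 x 1080 pixels, 24 bits per pixel, 25 frames per second) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def raw_video_bits_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed data rate: pixels per frame * bits per pixel * frame rate."""
    return width * height * bits_per_pixel * frames_per_second


raw = raw_video_bits_per_second(1920, 1080, 24, 25)
print(raw)   # 1244160000 bits per second, about 1.2 Gbit/s

# Apply the compression factors quoted above (3 for lossless, 20 to 200 for lossy)
for factor in (3, 20, 200):
    print("factor", factor, "->", round(raw / factor / 1_000_000, 1), "Mbit/s")
```

Even a factor of 3 leaves a very high data rate, which is why most video is stored and streamed with lossy compression.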

Codec
A codec is a device or computer program capable of encoding or decoding a digital data stream
or signal. Codec is a portmanteau of coder-decoder or, less commonly, compressor-decompressor.

A codec encodes a data stream or signal for transmission, storage or encryption, or decodes it for
playback or editing. Codecs are used in videoconferencing, streaming media and video editing
applications. A video camera's analog-to-digital converter (ADC) converts its analog signals into
digital signals, which are then passed through a video compressor for digital transmission or storage.
A receiving device then runs the signal through a video decompressor, then a digital-to-analog
converter (DAC) for analog display.

 Audio Codec
An audio codec is a device or computer program capable of coding or decoding a digital data
stream of audio.

In software, an audio codec is a computer program implementing an algorithm that compresses
and decompresses digital audio data according to a given audio file or streaming media audio
coding format. The objective of the algorithm is to represent the high-fidelity audio signal with
the minimum number of bits while retaining the quality. This can effectively reduce the storage space
and the bandwidth required for transmission of the stored audio file. Most codecs are
implemented as libraries which interface to one or more multimedia players.

In hardware, audio codec refers to a single device that encodes analog audio as digital signals
and decodes digital back into analog. In other words, it contains both an Analog-to-digital
converter (ADC) and Digital-to-analog converter (DAC) running off the same clock. This is used
in sound cards that support both audio in and out, for instance.

 Video Codec
A video codec is an electronic circuit or software that compresses or decompresses digital video,
thus converting raw (uncompressed) digital video to a compressed format or vice-versa. In the
context of video compression, "codec" is a concatenation of "encoder" and "decoder"; a device
that can only compress is typically called an encoder, and one that can only decompress is
known as a decoder.

The format of the compressed data usually conforms to a standard video compression
specification. The compression is typically lossy, meaning that the compressed video lacks some
of the information present in the original video. A consequence of this is that decompressed
video has lower quality than the original, uncompressed video because there is insufficient
information to accurately reconstruct the original video.

There are complex relationships between the video quality, the amount of data used to
represent the video (determined by the bit rate), the complexity of the encoding and decoding
algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end
delay (latency).
