Data Compression

Data Storage and Compression
Measurement of data storage

● A bit is the basic unit of all computing memory storage terms and is either 1 or 0.
● The word comes from binary digit.
● The byte is the smallest addressable unit of memory in most computers. 1 byte is 8
bits. A 4-bit number is called a nibble, which is half a byte.
● 1 byte of memory wouldn’t allow you to store very much information, so memory
size is measured in the multiples shown in Table 1.4:
● The above system of numbering is now used only for some storage devices and is
technically inaccurate for memory. It is based on the SI (base 10) system of units, where
1 kilo is equal to 1000.
● A 1 TB hard disk drive would allow the storage of 1 × 10^12 bytes according to this
system.
● Since memory size is measured in terms of powers of 2, another
system has been adopted by the IEC (International Electrotechnical Commission)
that is based on the binary system (Table 1.5):
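The difference between the two systems can be sketched in a few lines of Python. The prefix values are the standard SI (powers of 10) and IEC (powers of 2) definitions; the names SI_UNITS, IEC_UNITS and to_unit are illustrative only and do not come from the slides.

```python
# Decimal (SI) prefixes use powers of 10; binary (IEC) prefixes use powers of 2.
SI_UNITS = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
IEC_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_unit(size_in_bytes, unit):
    """Convert a byte count to the named SI or IEC unit (hypothetical helper)."""
    factor = {**SI_UNITS, **IEC_UNITS}[unit]
    return size_in_bytes / factor


one_tb_drive = 1 * SI_UNITS["TB"]               # a 1 TB drive holds 1 x 10^12 bytes
print(one_tb_drive)                             # 1000000000000
print(round(to_unit(one_tb_drive, "TiB"), 3))   # 0.909
print(IEC_UNITS["KiB"] - SI_UNITS["kB"])        # 24
```

The same drive therefore appears as roughly 0.909 TiB when measured in the binary system, which is why the two systems should not be mixed.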
Calculation of file size
Data compression
● Data compression manipulates the bit structure of a file in such a way that the
file becomes smaller in size.
● This means that less storage space will be needed to store the file and the file
will be easier to transmit from one device to another.
● Algorithm – a step by step set of instructions.
It is therefore necessary to reduce (or compress) the size of a file for the
following reasons:
● to save storage space on devices such as the hard disk drive/solid state drive.
● to reduce the time taken to stream a music or video file.
● to reduce the time taken to upload, download or transfer a file across a network.
● to reduce the network bandwidth used by the download/upload process (bandwidth
is the maximum rate of transfer of data across a network).
● reduced file size also reduces costs. For example, when using cloud storage,
the cost is based on the size of the files stored.
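The effect of file size on transfer time can be estimated by dividing the size in bits by the bandwidth in bits per second. The sketch below assumes an ideal link with no protocol overhead; the function name and all of the figures are hypothetical and chosen only for illustration.

```python
def transfer_time_seconds(file_size_bytes, bandwidth_bits_per_second):
    """Ideal transfer time: file size in bits divided by the rate in bits per second."""
    return (file_size_bytes * 8) / bandwidth_bits_per_second


original = 50_000_000      # hypothetical 50 MB file
compressed = 5_000_000     # the same file after a hypothetical 10:1 compression
bandwidth = 20_000_000     # hypothetical 20 Mbit/s link

print(transfer_time_seconds(original, bandwidth))    # 20.0
print(transfer_time_seconds(compressed, bandwidth))  # 2.0
```

Under these assumptions, a file one tenth of the size takes one tenth of the time to transfer over the same link.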
Lossy and Lossless file compression
● The term lossy compression comes from the word ‘loss’, which describes how this
method of compression works. With lossy compression, data that is deemed
redundant or unnecessary is removed in the compression process.
● The data is removed permanently, so it is effectively ‘lost’. This way the size of
the file is reduced.
● Lossy file compression results in some loss of detail when compared to the
original file. The algorithms used in the lossy technique must decide which
parts of the file need to be retained and which parts can be discarded.
● For example, when applying a lossy file compression algorithm to:
» an image, it may reduce the resolution and/or the bit/colour depth
» a sound file, it may reduce the sampling rate and/or the resolution
● Lossy compression is mostly used for multimedia such as audio, video and image
files. This is mostly done when streaming these files, as a file can be streamed
much more effectively if it is smaller in size.
● If a lossy compression method is used on a music file, it will try to remove
background noise and sounds that the human ear is unlikely to hear.
● Files compressed with a lossy method are usually smaller than the same files
compressed with a lossless method, which is of great benefit when
considering storage and data transfer rate requirements.
● Common lossy file compression formats are:
» MP3 (MPEG-1/2 Audio Layer III) and MP4 (MPEG-4), from the Moving Picture Experts Group
» JPEG, from the Joint Photographic Experts Group
● Lossless compression is a method of compression that loses no data in the
process. In lossless compression, the compression can be reversed to
reconstruct the data file exactly as it was.
● Lossless compression is used when it is essential that no data is lost or
discarded during the compression process.
● If a lossless compression method is used on a music file it will not lose any of
the data from the file. A possible way to compress the data would be to look for
repeating patterns in the music.
● Lossless compression can also be used when storing text files.
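The contrast between the two approaches can be sketched in Python. Run-length encoding is used here as one simple way to exploit repeating patterns; it is an illustrative choice, not the method used by any particular file format. The lossy part simply rounds sample values to a coarser resolution. All function names and data are hypothetical.

```python
def rle_encode(text):
    """Lossless: store each run of repeated characters as (character, count)."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original text exactly from the (character, count) pairs."""
    return "".join(ch * count for ch, count in runs)


def reduce_resolution(samples, step):
    """Lossy: round each sample down to a multiple of `step`, discarding fine detail."""
    return [(s // step) * step for s in samples]


text = "AAAABBBCCD"
runs = rle_encode(text)
print(runs)                      # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(runs) == text)  # True: nothing was lost

samples = [12, 13, 15, 200, 203, 207]
coarse = reduce_resolution(samples, 16)
print(coarse)                    # [0, 0, 0, 192, 192, 192]
print(coarse == samples)         # False: the original values cannot be recovered
```

The run-length version can always be decoded back to the exact input, whereas the rounded samples have permanently lost the small differences between values.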
File Formats
● A file format is the method that we choose to store different data on a computer.
Different file formats encode data in different ways. This means that they organise
the data for storage in different ways.
● It is important for software to recognise the file format used to save the data
in order to access it.
● There are many different types of file format. Some are specific to software, and
some are more generic or standard.
These are the most common file extensions:
There are four multimedia standard file formats that you should be aware of:
1. Musical Instrument Digital Interface (MIDI) uses a series of protocols and interfaces
that allow lots of different types of musical instrument to connect and communicate.
MIDI also allows one computer, or instrument, to control other instruments.
2. Joint Photographic Experts Group (JPEG) is a standard format for lossy
compression of images. It can reduce files down to 5% of their original size.
3. MP3 is a standard format for lossy compression of audio files.
4. MP4 is a standard format for lossy compression of video files. It can also be used
on other data such as audio and images.
MP3 and MP4 have developed from standards created by the Moving Picture Experts
Group (MPEG). The original MPEG standard is a lossy compression method for video
files dating back to 1991.
JPEGs, MP3s and MP4s are used in a wide variety of devices, such as computers,
digital cameras, DVD/Blu-Ray players and smartphones to store content.
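As a rough illustration of how software might recognise a file format, the sketch below guesses the format from the file extension. The extension table covers only the four formats described above, and the names KNOWN_FORMATS and guess_format are hypothetical. An extension is only a hint, so real software often inspects the file contents as well.

```python
from pathlib import Path

# Extensions commonly associated with the four formats described above.
KNOWN_FORMATS = {
    ".mid": "MIDI",
    ".midi": "MIDI",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".mp3": "MP3",
    ".mp4": "MP4",
}


def guess_format(filename):
    """Return the format suggested by the file extension, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return KNOWN_FORMATS.get(suffix, "unknown")


print(guess_format("holiday.JPG"))  # JPEG
print(guess_format("song.mp3"))     # MP3
print(guess_format("notes.xyz"))    # unknown
```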
Activity – Answer on foolscap paper.
1.
2.
