Data Storage and Compression

Data Storage

Name of memory size    Number of bytes        Name of memory size    Number of bytes
1 kilobyte (1 KB)      10^3  = 1000           1 kibibyte (1 KiB)     2^10 = 1024
1 megabyte (1 MB)      10^6  = 1000^2         1 mebibyte (1 MiB)     2^20 = 1024^2
1 gigabyte (1 GB)      10^9  = 1000^3         1 gibibyte (1 GiB)     2^30 = 1024^3
1 terabyte (1 TB)      10^12 = 1000^4         1 tebibyte (1 TiB)     2^40 = 1024^4
1 petabyte (1 PB)      10^15 = 1000^5         1 pebibyte (1 PiB)     2^50 = 1024^5
1 exabyte (1 EB)       10^18 = 1000^6         1 exbibyte (1 EiB)     2^60 = 1024^6

International System of Units, SI (Decimal Prefix)
It is based on the base 10 system of units, where 1 kilo is equal to 1000.

The International Electrotechnical Commission, IEC (Binary Prefix)
It is based on the base 2 system of units, where 1 kibi is equal to 1024.
• As computers are binary, the binary prefix is the most exact, but the decimal prefix units
are still used because it is far easier to multiply and divide by 1000 than by 1024.

Example
A hard disk is described as having a storage capacity of 1.5 TB. What is that in megabytes?

Storage capacity = 1.5 TB

Decimal prefix (each step on the ladder Byte, KB, MB, GB, TB, PB, EB is a factor of 1000)
Storage capacity (in GB) = 1.5 * 1000
= 1500 GB
Storage capacity (in MB) = 1500 * 1000
= 1500000 MB

Another method
Storage capacity (in MB) = 1.5 * 1000 * 1000
= 1500000 MB
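
The ladder method above can be sketched in Python. This is only an illustrative sketch: the function name `convert` and the two unit lists are assumptions made for this example and are not part of the original notes.

```python
# Units in ascending order; each step up is one factor of the base.
DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]        # base 1000
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]   # base 1024


def convert(value, from_unit, to_unit):
    """Convert between two units that belong to the same prefix family."""
    for units, base in ((DECIMAL_UNITS, 1000), (BINARY_UNITS, 1024)):
        if from_unit in units and to_unit in units:
            # Positive steps mean moving down the ladder (multiply),
            # negative steps mean moving up the ladder (divide).
            steps = units.index(from_unit) - units.index(to_unit)
            return value * base ** steps
    raise ValueError("both units must be decimal, or both must be binary")


print(convert(1.5, "TB", "MB"))  # 1500000.0
```

Moving two steps down the ladder (TB to GB to MB) multiplies by 1000 twice, which matches the worked example.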
An image file has a file size of 363143213 bits. Create an expression to convert this size to
mebibytes (MiB) and show the result.

Binary prefix (each step on the ladder Byte, KiB, MiB, GiB, TiB, PiB, EiB is a factor of 1024)

File size (in bytes) = 363143213 / 8 ≈ 45392902 bytes
File size (in mebibytes) = 45392902 / 2^20
= 45392902 / (1024 * 1024)
≈ 43 MiB

As a single expression:
File size (in mebibytes) = 363143213 / (8 * 1024 * 1024) ≈ 43 MiB
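
As a quick check, the same expression can be written in Python. The function name `bits_to_mebibytes` is a hypothetical name chosen for this sketch. Note that the brackets around the divisor matter: without them, the division and multiplication are evaluated left to right.

```python
def bits_to_mebibytes(bits):
    """Convert a size in bits to mebibytes (MiB)."""
    size_in_bytes = bits / 8
    # The brackets are required: 1 MiB = 1024 * 1024 bytes.
    return size_in_bytes / (1024 * 1024)


size_mib = bits_to_mebibytes(363143213)
print(round(size_mib))  # 43

# Without brackets, "/ 1024 * 1024" divides and then multiplies back,
# so the value is unchanged instead of being converted.
unbracketed = 45392902 / 1024 * 1024
print(unbracketed)  # 45392902.0
```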
• An image is 2322 pixels high and 4128 pixels wide. The image is stored with a 16-bit colour depth.
Construct an expression to show how the file size, in megabytes, is calculated. You do not need to
do the calculation.

File size of image (in bits) = 2322 * 4128 * 16
File size of image (in bytes) = (2322 * 4128 * 16) / 8
File size of image (in megabytes) = (2322 * 4128 * 16) / (8 * 1000 * 1000)
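
Although the question does not require the calculation, the expression can be evaluated with a short Python sketch. The function name `image_size_megabytes` and its parameter names are assumptions made for this illustration.

```python
def image_size_megabytes(height, width, colour_depth_bits):
    """File size of an uncompressed bitmap image in megabytes (decimal prefix)."""
    size_in_bits = height * width * colour_depth_bits
    size_in_bytes = size_in_bits / 8
    return size_in_bytes / (1000 * 1000)


print(image_size_megabytes(2322, 4128, 16))  # roughly 19.17 MB
```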
• A photograph is 1024 * 1080 pixels and uses a colour depth of 32 bits. How many photographs of
this size would fit onto a memory stick of 64 GiB?

Size of photograph (in bytes) = (1024 * 1080 * 32) / 8
= 4423680 bytes

Size of memory stick (in bytes) = 64 * 2^30
= 64 * 1024 * 1024 * 1024
= 68719476736 bytes

Number of photographs that would fit onto the memory stick = 68719476736 / 4423680
= 15534 photos (rounded down to a whole number of photographs)
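
The calculation above can be reproduced in Python as a minimal sketch. The variable names are chosen for this example only. Integer division (`//`) is used for the final step because only whole photographs can be stored.

```python
# Size of one photograph: width * height * colour depth, converted from bits to bytes.
photo_bytes = 1024 * 1080 * 32 // 8

# Capacity of the memory stick: 64 GiB, where 1 GiB = 2**30 bytes.
stick_bytes = 64 * 2**30

# Whole photographs that fit (any leftover space is discarded).
photo_count = stick_bytes // photo_bytes

print(photo_bytes)  # 4423680
print(stick_bytes)  # 68719476736
print(photo_count)  # 15534
```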
An audio CD has a sample rate of 44100 Hz and a sample resolution of 16 bits. The music
being sampled uses two channels. Calculate the file size for a 60-minute recording in MB.

File size of digital audio file (in bits) = sample rate * bit depth * duration (in seconds) * number of channels

File size (in bits) = 44100 * 16 * (60 * 60) * 2
= 5080320000 bits
File size (in bytes) = 5080320000 / 8
= 635040000 bytes
File size (in MB) = 635040000 / 10^6
= 635040000 / (1000 * 1000)
≈ 635 MB

Page No. 34, Act 1.16 (1 to 4)
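
The audio formula can be sketched in Python as follows. The function name `audio_size_bytes` and its parameter names are assumptions made for this example.

```python
def audio_size_bytes(sample_rate, bit_depth, seconds, channels):
    """File size of uncompressed audio in bytes."""
    size_in_bits = sample_rate * bit_depth * seconds * channels
    return size_in_bits // 8


size_bytes = audio_size_bytes(44100, 16, 60 * 60, 2)
size_mb = size_bytes / (1000 * 1000)

print(size_bytes)  # 635040000
print(size_mb)     # 635.04
```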
Exercise
1. A hard disk is described as having a storage capacity of 1.5 TB. What is that in kilobytes?
2. A camera detector has an array of 1920 by 1536 pixels. A colour depth of 16 bits is used.
Calculate the size of a photograph taken by this camera, giving your answer in MB.
3. Photographs have been taken by a smartphone which uses a detector with a 1024 * 1536 pixel
array. The software uses a colour depth of 24 bits. How many photographs could be stored on a
640 MB memory card?
4. The typical song stored on a music CD is 3 minutes and 30 seconds long. Assume each song is
sampled at 44100 Hz, 16 bits are used per sample, and each song uses two channels. Calculate
how many typical songs could be stored on a 740 MiB CD.

5. Pg. 42, No. 5 (a, b, c)
Pg. 43, No. 8 (a)
Compression
Sound and image files can be very large.
Therefore it is necessary to reduce (or compress) the size of a file for the following reasons:
• To save storage space
• To reduce transmission time

Compression: changing the format of a data file so that the size of the file becomes smaller.
There are two types of compression:
• Lossless
• Lossy
Lossless compression
• When lossless compression is used, no data is lost and the original file can be
restored exactly.
• This is particularly important for files where any loss of data would be disastrous.
• Lossless compression is used for text files and program files (executable files and source code), as
missing data would completely change the meaning so that the file could not be
understood.
Run-Length Encoding (RLE)
• RLE is used to reduce the size of a repeating string of items.
• It is a form of lossless file compression.
• The repeating string is called a run and is represented by two bytes:
• The first byte represents the number of times the item of information is repeated.
• The second byte represents the item of information.
• RLE is only effective where there are long runs of repeated units/bits.
• The file may not be compressed very much at all if the characters are not repeated.

Example (one byte is used to encode each letter)
aaaaabbbbccdddddd = 17 bytes

RLE version
05 97 04 98 02 99 06 100 = 8 bytes
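
The two-byte scheme above (count, then ASCII code) can be sketched in Python. This is a minimal illustration, not a production encoder. The function names `rle_encode` and `rle_decode` are hypothetical, and splitting runs longer than 255 is an assumption added here so that each count still fits in one byte.

```python
from itertools import groupby


def rle_encode(text):
    """Return a flat list [count, code, count, code, ...] for the text."""
    encoded = []
    for char, run in groupby(text):
        count = len(list(run))
        # Assumption: a count must fit in one byte, so long runs are split.
        while count > 255:
            encoded += [255, ord(char)]
            count -= 255
        encoded += [count, ord(char)]
    return encoded


def rle_decode(encoded):
    """Rebuild the original text from (count, code) pairs."""
    pairs = zip(encoded[0::2], encoded[1::2])
    return "".join(chr(code) * count for count, code in pairs)


encoded = rle_encode("aaaaabbbbccdddddd")
print(encoded)       # [5, 97, 4, 98, 2, 99, 6, 100]
print(len(encoded))  # 8
```

Decoding the list returns the original 17 characters, which is what makes the method lossless.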
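• One issue occurs with a string such as 'aaaaaaaa bbbbbbbbbb cdcdcd eeeeeeee', where RLE
compression isn't very effective because 'cdcdcd' contains no repeated characters, so every character costs two bytes.
(The spaces are shown only for readability and are not counted.)
• aaaaaaaa bbbbbbbbbb cdcdcd eeeeeeee = 32 bytes
RLE encoding
08 97 10 98 01 99 01 100 01 99 01 100 01 99 01 100 08 101 = 18 bytes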
• To cope with this, use a flag.
• A flag preceding data indicates that what follows is the number of repeating units and the unit itself
(for example, 255 03 97, where 255 is the flag and the other two numbers indicate that
there are three items with ASCII code 97). Data that is not preceded by a flag is stored unchanged.

aaaaaaaa bbbbbbbbbb cdcdcd eeeeeeee = 32 bytes

255 08 97 255 10 98 99 100 99 100 99 100 255 08 101 = 15 bytes
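
A flagged encoder can be sketched as below. Several details are assumptions made for this illustration and are not stated in the notes: the names `rle_encode_with_flag` and `rle_decode_with_flag`, the threshold `min_run=4` (a flagged run costs three bytes, so shorter runs are cheaper stored as they are), and the requirement that the data never contains the value 255 itself. Runs longer than 255 are not handled.

```python
from itertools import groupby

FLAG = 255  # marker byte; assumes the data itself never contains 255


def rle_encode_with_flag(text, min_run=4):
    """Encode long runs as FLAG, count, code; copy short runs unchanged."""
    encoded = []
    for char, run in groupby(text):
        count = len(list(run))
        code = ord(char)
        if count >= min_run:
            encoded += [FLAG, count, code]
        else:
            encoded += [code] * count
    return encoded


def rle_decode_with_flag(encoded):
    """Reverse the flagged encoding."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            count, code = encoded[i + 1], encoded[i + 2]
            out.append(chr(code) * count)
            i += 3
        else:
            out.append(chr(encoded[i]))
            i += 1
    return "".join(out)


data = "aaaaaaaa" + "bbbbbbbbbb" + "cdcdcd" + "eeeeeeee"
flagged = rle_encode_with_flag(data)
print(flagged)
print(len(flagged))  # 15
```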


• For a black and white image, RLE would be effective.
• A colour image in which there are very short runs of different colours would not be encoded
as effectively.

To demonstrate RLE, the digits 1 and 0 are used to represent white and black.
11000111 11011111 11011111

RLE version
02 1 03 0 05 1 01 0 07 1 01 0 05 1
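
The black and white example can be checked with a short Python sketch. The function name `rle_bits` is hypothetical, and the sketch assumes the three bytes are treated as one continuous stream of pixels, which is how the RLE version above counts its runs.

```python
from itertools import groupby


def rle_bits(bits):
    """Return (run length, bit) pairs for a string of 1s and 0s."""
    # Spaces only separate the bytes visually, so remove them first.
    stream = bits.replace(" ", "")
    return [(len(list(run)), bit) for bit, run in groupby(stream)]


runs = rle_bits("11000111 11011111 11011111")
print(runs)
# [(2, '1'), (3, '0'), (5, '1'), (1, '0'), (7, '1'), (1, '0'), (5, '1')]
```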
Exercise

1. To demonstrate RLE, the digits 1 and 0 are used to represent white and black.
2. For the given string
abcd eeee dddd fffffffff aaaa bc
(i) encode using RLE (without a flag)
(ii) encode using RLE (with a flag)
Lossy Compression

• It decreases file size by deleting some of the data.
• Therefore, the original file cannot be re-formed entirely when it is decompressed.
• It cannot be used with text or program files.
• It can be used for bitmap image and audio files, where we often cannot notice that data
has been removed.

JPEG

• JPEG is a lossy file compression algorithm used for bitmap images.
• When it finds tiny colour differences, it gives them the same colour value, so it can
rewrite the file using fewer bits.
• It reduces the resolution and colour depth of the image.
• JPEG files, with the extension .jpg, use lossy compression.
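
The idea of giving nearly identical colours the same value can be illustrated with a very small Python sketch. This is not the JPEG algorithm, which is far more involved. It only shows, under the assumption of 8-bit values (0 to 255) and a hypothetical helper named `quantise`, how merging small differences loses data but produces longer runs of equal values that take fewer bits to store.

```python
def quantise(value, step=16):
    """Map an 8-bit value to the bottom of its step-wide band."""
    return (value // step) * step


# One row of pixel brightness values with tiny differences (made-up data).
row = [200, 201, 203, 202, 90, 91, 89, 90]
merged = [quantise(v) for v in row]

print(merged)  # [192, 192, 192, 192, 80, 80, 80, 80]
```

The original values 200, 201, 203 and 202 cannot be recovered from 192, which is why this kind of compression is lossy.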
Audio File
• Much of the data in an audio file encodes tones and frequencies that
our ears cannot hear, and small differences in volume and
frequency.
• Lossy compression removes this redundant and excess data.
• MP3, MPEG and MP4 use lossy compression.
MP3
• A lossy file compression method used for music files.
• It will reduce the size of a normal music file by about 90%.

How to compress a sound file

• A compression algorithm is used.
• It uses PERCEPTUAL MUSIC SHAPING, which removes sounds that the
human ear can't hear properly.
• The sample rate and sample resolution could be reduced.
• If two sounds are played at the same time, only the louder one can be heard by
the ear, so the softer sound is eliminated.
• Unnecessary data is permanently eliminated, so the original file cannot be recovered after
compression.
MP4
• MP4 is a lossy file compression format used for multimedia data.
• Multimedia files such as music, videos, photos and
animation can all be stored in the MP4 format.
• Video streamed on the internet is usually in MP4 format.

How to compress a video file

• A compression algorithm is used.
• The resolution and colour depth could be reduced.
• For the sound, it uses PERCEPTUAL MUSIC SHAPING, which removes sounds
that the human ear can't hear properly.
Disadvantages of compression
• It can affect the quality of the file.
• Compressing and decompressing the file takes execution
time.
• Both compressing and decompressing have to be
done with compatible software.