
# Chapter 2

## Multimedia Information Representation


2.1 Introduction
2.2 Digitization principles
2.3 Text
2.4 Images
2.5 Audio
2.6 Video

Introduction
The conversion of an analog signal into a
digital form
Signal encoder, sampling, signal decoder

## 2.2 Digitization principles

2.2.1 Analog signals
Fourier analysis can be used to show that any
time-varying analog signal is made up of a
possibly infinite number of single-frequency
sinusoidal signals whose amplitude and phase
vary continuously with time relative to each other
Signal bandwidth
Fig2.1
The bandwidth of the transmission channel
should be equal to or greater than the bandwidth
of the signal; a channel with a lower bandwidth is a bandlimiting channel

## Multimedia Information Representation

Multimedia Information is stored and processed
within a computer in a digital form.
Codeword: Combination of a fixed number of bits that
represents each character, in the case of textual
information.
Analog signal: Signal whose amplitude (magnitude of
the sound/image intensity) varies continuously with
time.
Signal encoder: Electrical circuit used for the
conversion of an analog signal into a digital form.
Signal decoder: Electrical circuit that converts stored
digitized samples into time-varying analogue form.

Analog Signals
As mentioned earlier, the amplitude of the signal
varies continuously with time
Fourier analysis can be used to show that any
time-varying signal is made up of a possibly infinite number of
single-frequency sinusoidal components
The range of frequencies of the sinusoidal
components that make up the signal is called the
signal bandwidth
Speech bandwidth: 50 Hz to 10 kHz
Music bandwidth: 15 Hz to 20 kHz

To transmit an analogue signal through a network,
the bandwidth of the transmission channel should be
equal to or greater than the signal bandwidth
If the bandwidth of the channel is less than the
signal bandwidth, then the channel is called a
bandlimiting channel

Encoder Design

## 2.2.2 Encoder design

The encoder consists of a bandlimiting filter and an analog-to-digital
converter (ADC), the latter comprising a sample-and-hold circuit and a quantizer
Fig2.2
Remove selected higher-frequency components
from the source signal (A)
(B) is then fed to the sample-and-hold circuit
Sample the amplitude of the filtered signal at
regular time intervals (C) and hold the sample
amplitude constant between samples (D)

## 2.2.2 Encoder design

Quantizer circuit which converts each
sample amplitude into a binary value
known as a codeword (E)
The signal must be sampled at a rate which
is higher than the maximum rate of
change of the signal amplitude
The number of different quantization
levels used should be as large as possible

## 2.2.2 Encoder design

Nyquist sampling theorem states that:
in order to obtain an accurate
representation of a time-varying analog
signal, its amplitude must be sampled
at a minimum rate that is equal to or
greater than twice the highest
sinusoidal frequency component that is
present in the signal.

Encoder Design

Bandlimiting filter: Removes selected higher-frequency
components from the source signal
Sample-and-hold circuit: Samples the amplitude of the
filtered signal at regular intervals and holds the
sampled amplitude constant between samples
Quantizer: Converts the samples into their
corresponding binary form

## Encoder Design Data representation

The most significant bit of the codeword
represents the sign of the sample
A binary 0 indicates a positive value and a
binary 1 indicates a negative value
The signal must be sampled at a much higher
rate than the maximum rate of change of the
signal amplitude
The number of quantization levels should be as
large as possible to represent the signal accurately

Sampling Rate
Nyquist sampling theorem: To obtain an accurate
representation of a time-varying analogue signal, its
amplitude must be sampled at a minimum rate that is
equal to or greater than twice the highest sinusoidal
frequency component that is present in the signal
The Nyquist rate is expressed either in Hz or, more
correctly, in samples per second (sps)
Antialiasing filter: Another name for the bandlimiting
filter, since it passes only those frequency components
that can be sampled at the chosen rate without aliasing
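
The sampling theorem above can be illustrated with a short sketch. The function names are hypothetical, and the folding formula assumes an ideal sampler with no antialiasing filter in front of it.

```python
def nyquist_rate(max_freq_hz: float) -> float:
    """Minimum sampling rate (samples per second) for a signal whose
    highest sinusoidal component is max_freq_hz."""
    return 2.0 * max_freq_hz


def alias_frequency(signal_hz: float, sample_rate_sps: float) -> float:
    """Apparent frequency of a sinusoid after ideal sampling.
    Components above half the sampling rate fold back into the baseband."""
    folded = signal_hz % sample_rate_sps
    return min(folded, sample_rate_sps - folded)


# Speech with a 10 kHz highest component needs at least 20 ksps.
print(nyquist_rate(10_000))
# A 6 kHz tone sampled at only 8 ksps appears as a 2 kHz alias.
print(alias_frequency(6_000, 8_000))
```

A tone below half the sampling rate is returned unchanged, which is why the bandlimiting filter is placed before the sampler.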

In reality the transmission channel used often has a
lower bandwidth
To avoid distortion, the source signal is first passed
through the bandlimiting filter (BLF), which is designed to pass only the
frequency components that are within the channel
bandwidth
This avoids alias signals caused by undersampling

Quantization Intervals
An exact representation of the analogue samples would require an
infinite number of binary digits

Quantization Intervals
Three bits are used to represent each sample (1 bit for the
sign and 2 bits for the magnitude)
If Vmax is the maximum positive and negative signal
amplitude and n is the number of binary bits used, then the
quantization interval, q, is defined as
$q = 2V_{max} / 2^n$
A signal anywhere within the quantization interval will be
represented by the same binary codeword
Each codeword is at the centre of the corresponding
quantization interval
Therefore the codeword can differ from the actual signal level by up to q/2

Quantization error is the difference between the
actual signal amplitude and the corresponding
nominal amplitude (also known as quantization
noise since values vary randomly)
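
A minimal sketch of a uniform quantizer follows, to show that the error never exceeds half an interval. It assumes the interval q = 2Vmax / 2^n and, for simplicity, numbers the intervals from 0 upwards instead of using the sign-and-magnitude codewords described in the slides; the function name is hypothetical.

```python
import math


def quantize(v: float, v_max: float, n_bits: int):
    """Map amplitude v in [-v_max, +v_max] to an interval index and the
    nominal amplitude at the centre of that interval."""
    q = 2 * v_max / 2 ** n_bits                  # quantization interval
    index = math.floor((v + v_max) / q)          # which interval v falls in
    index = max(0, min(2 ** n_bits - 1, index))  # clip v = +v_max into the top interval
    nominal = -v_max + (index + 0.5) * q         # centre of the interval
    return index, nominal, q


index, nominal, q = quantize(0.3, v_max=1.0, n_bits=3)
print(index, nominal, q)
print("error:", abs(0.3 - nominal), "limit:", q / 2)
```

With 3 bits and Vmax = 1 the interval is 0.25, so every sample is reproduced to within 0.125 of its true amplitude.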

Dynamic Range

With high-fidelity music it is important to be able
to hear very quiet passages without any distortion
created by quantization noise
Dynamic range is defined as the ratio of the
maximum signal amplitude to the minimum signal amplitude:
$D = 20 \log_{10}(V_{max}/V_{min})$ dB
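
The dynamic range formula can be evaluated directly. This is a small sketch with hypothetical function names; the inverse function is added only for convenience.

```python
import math


def dynamic_range_db(v_max: float, v_min: float) -> float:
    """Dynamic range D = 20 log10(Vmax / Vmin), in decibels."""
    return 20 * math.log10(v_max / v_min)


def amplitude_ratio(d_db: float) -> float:
    """Inverse relation: the Vmax / Vmin ratio for a given dynamic range."""
    return 10 ** (d_db / 20)


print(dynamic_range_db(100.0, 1.0))  # amplitude ratio of 100:1
print(amplitude_ratio(40.0))         # ratio corresponding to 40 dB
```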

Decoder Design

Encoder + decoder = codec

A signal decoder is an electronic circuit that
converts the stored digitized samples back again
into their analogue form prior to their output, using a
digital-to-analogue converter (DAC) and a low-pass
filter
Low-pass filter: Only passes those frequency
components that were passed by the bandlimiting filter in the encoder
Text
Unformatted text: Known as plain text; enables pages
to be created which comprise strings of fixed-sized
characters from a limited character set
Formatted text: Known as rich text; enables pages to
be created which comprise strings of characters of
different styles, sizes and shapes, with tables, graphics
and images inserted at appropriate points
Hypertext: Enables an integrated set of documents
(each comprising formatted text) to be created which
have defined links (hyperlinks) between them
## Unformatted Text The basic ASCII character set

Control characters
(Back space, escape,
delete, form feed etc)
Printable characters
(alphabetic, numeric,
and punctuation)
The American Standard Code for Information
Interchange (ASCII) is one of the most widely used character sets,
and the table includes the binary codewords used to
represent each character (a 7-bit binary code)
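
As a quick illustration of the 7-bit codewords, the sketch below prints the ASCII codeword of a few characters. The helper name is hypothetical; treating the three most significant bits as the table column is an assumption drawn from the column numbering used in these slides.

```python
def ascii_codeword(ch: str) -> str:
    """Return the 7-bit ASCII codeword of a single character as a bit string."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not in the basic ASCII character set")
    return format(code, "07b")


for ch in ("A", "a", "0", " "):
    word = ascii_codeword(ch)
    # The three most significant bits select the column of the code table.
    print(repr(ch), word, "column", word[:3])

print(ascii_codeword("A"))  # 1000001
```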

characters

The characters in columns 010/011 and 110/111 are
replaced with the set of mosaic characters; these are then
used, together with the various uppercase characters
illustrated, to create relatively simple graphical images

## Unformatted Text Examples of

Videotext/Teletext

Although in practice the total page is made up of a
matrix of symbols and characters which all have the
same size, some simple graphical symbols and text of
larger sizes can be constructed by the use of groups of
the basic symbols

Formatted Text

It is produced by most word processing packages and
used extensively in the publishing sector for the
preparation of papers, books, magazines, journals and so
on.
Documents of mixed type (characters of different styles,
fonts, shapes etc.) are possible.
Format control characters are used

Hypertext can be used to create an electronic version of
documents with the index, descriptions of departments,
courses on offer, library, and other facilities all written in
hypertext as pages with various defined hyperlinks

An example of a hypertext language is HTML, a mark-up language used to
describe how the contents of a document are presented
on a printer or a display; other mark-up languages are
PostScript, SGML (Standard Generalized Markup
Language), TeX and LaTeX.

Images
Images include computer-generated images (referred
to as computer graphics or simply graphics) and
digitized images of both documents and pictures
All types of images are displayed in the form of a two-dimensional matrix of individual picture elements
(pixels or pels), but are represented differently within the
computer memory (file)
Each of these types of image is created differently.

Graphics

A VGA display consists
of a matrix of 640 horizontal pixels by 480
vertical pixels with, for example, 8 bits per pixel,
which allows each pixel to have one of 256
different colours

Graphics

Colouring a solid block
with the same colour is
known as rendering.
All objects are made up of a series of lines that are
connected to each other and, what appears as a curved
line, in practice is a series of short lines each made up
of a string of pixels
Each object has a number of attributes associated
with it. These include its shape, size in terms of pixel
position, colour of the border etc.

Graphics - Conclusions

- high-level representation (similar to the source
code of a program): requires less memory to store the
image and less bandwidth for transmission
- actual pixel image of the graphic (similar to
low-level machine code and generally known as
bit-map format), e.g. GIF (Graphics Interchange
Format), TIFF (Tagged Image File Format)
A graphic can be transferred over the network in
either form
A software package called SRGP (Simple Raster Graphics
Package) is used to convert the high-level form into a
pixel-image form

The scanner associated with fax machines operates
by scanning each complete page from left to right to
produce a sequence of scan lines that start at the top of
the page and end at the bottom
Vertical resolution is either 3.85 or 7.7 lines per
mm (approximately 100 or 200 lines per inch)

Fax machines use a single binary digit to represent
each pel, a 0 for a white pel and a 1 for a black pel.
Hence the digital representation of a scanned page
produces a stream of about 2 million bits.
A single binary digit per pel means fax machines are
best suited to bitonal images.

## Colour derivation principles: additive colour mixing (R + G + B)

Black is produced when all three primary colours
(R, G, B) are zero.
Useful for producing a colour image on a black
surface, as is the case in display applications

Subtractive colour mixing: white is produced when the three chosen
primary colours cyan, magenta and yellow are all
zero.
Useful for producing a colour image on a white
surface, as is the case in printing applications.

principles

The picture tubes used in most television sets
operate using what is known as a raster scan;
this involves a finely focussed electron beam
being scanned over the complete screen

Progressive scanning is performed by repeating
the scanning operation that starts at the top left
corner of the screen and ends at the bottom right
corner, followed by the beam being deflected back
again to the top left corner

The set of three related colour-sensitive
phosphors associated with each pixel is called a
phosphor triad and the typical arrangement of the
triads on each scan line is shown.

Frame: Each complete set of horizontal scan lines
(either 525 for North & South America and most of Asia,
or 625 for Europe and other countries)
Flicker: Caused by the previous image fading from the
eye retina before the following image is displayed, which happens at
a low refresh rate (to avoid this a refresh rate of 50
times per second is required)
Pixel depth: Number of bits per pixel, which determines
the range of different colours that can be produced
Colour Look-up Table (CLUT): Table that stores a
selected subset of colours; each pixel value is then used as an address to a location in the table,
reducing the amount of memory required to store an
image
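
The memory saving from a colour look-up table can be estimated with a short calculation. This is a sketch under assumed figures: a 640 × 480 image, 24-bit colours, and a table addressed by 8-bit pixel values. The function names are hypothetical.

```python
def direct_colour_bits(width: int, height: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores its full colour value."""
    return width * height * colour_bits


def clut_bits(width: int, height: int, index_bits: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores an index into a table
    of 2**index_bits full colour entries."""
    table = (2 ** index_bits) * colour_bits
    return width * height * index_bits + table


print(direct_colour_bits(640, 480, 24))  # 7372800 bits
print(clut_bits(640, 480, 8, 24))        # 2463744 bits
```

The table itself costs only 256 × 24 bits, so the indexed image needs roughly a third of the memory, at the price of being limited to 256 distinct colours per image.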

Digitized Pictures

Aspect ratio: This is the ratio of the screen
width to the screen height (television tubes and
PC monitors have an aspect ratio of 4/3 and wide-screen
television is 16/9)

NTSC = 525 lines per frame (480 visible)
PAL, CCIR, SECAM = 625 lines per frame (576 visible)
Example display resolutions: VGA (640 × 480 × 8), XGA (1024 × 768 × 8) and SVGA (1024 × 768 × 24)

## 2.4.3 Digitized pictures

Color principles
A whole spectrum of colors known as a color
gamut can be produced by using different
proportions of red(R), green(G), and blue (B)
Fig 2.12
Additive color mixing producing a color image on a
black surface
Subtractive color mixing for producing a color
image on a white surface
Fig 2.13

## 2.4.3 Digitized pictures

Raster-scan principles
Progressive scanning
Each complete set of horizontal scan is
called a frame
The number of bits per pixel is known as
the pixel depth and determines the range
of different colors.

## 2.4.3 Digitized pictures

Aspect ratio
Both the number of pixels per scanned line and
the number of lines per frame depend on the aspect ratio of the display
The aspect ratio is the ratio of the screen width to the screen height
National Television Standards Committee (NTSC), PAL (UK), CCIR (Germany), SECAM (France)
Table 2.1

## 2.4.3 Digitized pictures

Example 2.3
Derive the time to transmit the following digitized images over both 64 kbps and
1.5 Mbps networks
a 640 × 480 × 8 VGA-compatible image
a 1024 × 768 × 24 SVGA-compatible image

Solution
The size of each image in bits is as follows
a VGA image = 640 × 480 × 8 = 2.46 Mbits
an SVGA image = 1024 × 768 × 24 = 18.87 Mbits
The time to transmit each image is given as follows
at 64 kbps: VGA = $(2.46 \times 10^6)/(64 \times 10^3)$ = 38.4 sec.
SVGA = $(18.87 \times 10^6)/(64 \times 10^3)$ = 295 sec.
at 1.5 Mbps: VGA = $(2.46 \times 10^6)/(1.5 \times 10^6)$ = 1.64 sec.
SVGA = $(18.87 \times 10^6)/(1.5 \times 10^6)$ = 12.58 sec.
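
The arithmetic of Example 2.3 can be checked with a few lines of code. This sketch uses the exact bit counts instead of the rounded Mbit figures, so the last digit of a result may differ slightly from a hand calculation; the function names are hypothetical.

```python
def image_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed digitized image in bits."""
    return width * height * bits_per_pixel


def transmit_seconds(bits: int, rate_bps: float) -> float:
    """Time to send the given number of bits at a constant bit rate."""
    return bits / rate_bps


images = {"VGA": image_bits(640, 480, 8), "SVGA": image_bits(1024, 768, 24)}
for name, bits in images.items():
    for rate in (64e3, 1.5e6):
        seconds = transmit_seconds(bits, rate)
        print(f"{name}: {bits} bits at {rate:.0f} bps takes {seconds:.2f} s")
```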

## 2.4.3 Digitized pictures

Digital cameras and scanners
An image is captured within the
camera/scanner using an image sensor
A two-dimensional grid of light-sensitive
cells called photosites
A widely used image sensor is a charge-coupled device (CCD)
Fig 2.16

Schematic

Typical arrangement that is used to capture
and store a digital image produced by a scanner
or a digital camera (either a still camera or a
video camera)

Schematic

Image sensor: Silicon chip which consists of a two-dimensional
grid of light-sensitive cells called photosites, each of which stores
the level of intensity of the light that falls on it
Charge-coupled device (CCD): Image sensor that
converts the level of light intensity on each photosite
into an equivalent electrical charge

2.6 Video
Scanning sequence
It is necessary to use a minimum refresh
rate of 50 times per second to avoid flicker
A refresh rate of 25 times per second is
sufficient to produce smooth motion
Field: each frame is transmitted as two fields, the first comprising only the odd scan
lines and the second the even scan lines

The two fields are then integrated together in
the television receiver using a technique
known as interlaced scanning
Fig 2.19
The three main properties of a color source
Brightness
Hue: this represents the actual color of the
source
Saturation: this represents the strength or
vividness of the color

The term luminance is used to refer to the
brightness of a source
The hue and saturation are referred to as its
chrominance
$Y_s = 0.299 R_s + 0.587 G_s + 0.114 B_s$

where $Y_s$ is the amplitude of the luminance
signal and $R_s$, $G_s$ and $B_s$ are the magnitudes
of the three color component signals

The blue chrominance (Cb), and the red
chrominance (Cr) are then used to
represent hue and saturation
The two color difference signals:

$C_b = B_s - Y_s$

$C_r = R_s - Y_s$

In the PAL system, Cb and Cr are referred to as
U and V respectively
PAL:
$Y = 0.299R + 0.587G + 0.114B$
$U = 0.493(B - Y)$
$V = 0.877(R - Y)$

The NTSC system forms two different signals
referred to as I and Q
NTSC:
$Y = 0.299R + 0.587G + 0.114B$
$I = 0.74(R - Y) - 0.27(B - Y)$
$Q = 0.48(R - Y) + 0.41(B - Y)$
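
The luminance and color-difference relations above translate directly into code. The sketch below assumes R, G and B are normalized to the range 0 to 1, takes the luminance weights as 0.299, 0.587 and 0.114, and reads the I and Q signals as I = 0.74(R − Y) − 0.27(B − Y) and Q = 0.48(R − Y) + 0.41(B − Y); the function names are hypothetical.

```python
def luminance(r: float, g: float, b: float) -> float:
    """Y = 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def rgb_to_yuv(r: float, g: float, b: float):
    """PAL: scaled blue and red color-difference signals U and V."""
    y = luminance(r, g, b)
    return y, 0.493 * (b - y), 0.877 * (r - y)


def rgb_to_yiq(r: float, g: float, b: float):
    """NTSC: I and Q are formed from the same two color differences."""
    y = luminance(r, g, b)
    i = 0.74 * (r - y) - 0.27 * (b - y)
    q = 0.48 * (r - y) + 0.41 * (b - y)
    return y, i, q


print(rgb_to_yuv(1.0, 1.0, 1.0))  # white: full luminance, no chrominance
print(rgb_to_yiq(1.0, 0.0, 0.0))  # saturated red
```

For any grey level (R = G = B) both color differences are zero, which is what allows a monochrome receiver to use the luminance signal alone.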

## 2.6.2 Digital video

Experiments have shown that the resolution of the eye is
less sensitive for color than it is for luminance
4:2:2 format
The original digitization format used in
Recommendation CCIR-601
A line sampling rate of 13.5 MHz for luminance
and 6.75 MHz for each of the two chrominance signals
The number of samples per line is increased to
720

## 2.6.2 Digital video

The corresponding number of samples for each
of the two chrominance signals is 360 samples
per active line
This results in 4Y samples for every 2Cb, and 2Cr
samples
The numbers 480 and 576 being the number of
active (visible) lines in the respective system
Fig. 2.21
Example 2.7

## Figure 2.21 Sample positions

with 4:2:2 digitization format.

## 2.6.2 Digital video

4:2:0 format is used in digital video
Interlaced scanning is used, with
chrominance samples absent in
alternate lines
The same luminance resolution but half the
chrominance resolution
Fig2.22

## Figure 2.22 Sample

positions in 4:2:0
digitization format.

## 2.6.2 Digital video

525-line system: Y = 720 × 480, Cb = Cr = 360 × 240
625-line system: Y = 720 × 576, Cb = Cr = 360 × 288
Bit rate = $13.5 \times 10^6 \times 8 + 2(3.375 \times 10^6 \times 8) = 162$ Mbps
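
The bit rate of a digitization format follows from its sampling rates. The sketch below assumes 8 bits per sample, a luminance rate of 13.5 MHz, and a rate per chrominance signal of 6.75 MHz for 4:2:2 and 3.375 MHz for 4:2:0. The 216 Mbps figure for 4:2:2 is derived here from those rates and is not quoted in the slides; the function name is hypothetical.

```python
def video_bit_rate_bps(luma_rate_hz: float, chroma_rate_hz: float,
                       bits_per_sample: int = 8) -> float:
    """Bit rate of one luminance signal plus two chrominance signals."""
    luma = luma_rate_hz * bits_per_sample
    chroma = 2 * chroma_rate_hz * bits_per_sample
    return luma + chroma


print(video_bit_rate_bps(13.5e6, 6.75e6) / 1e6)   # 4:2:2, in Mbps
print(video_bit_rate_bps(13.5e6, 3.375e6) / 1e6)  # 4:2:0, in Mbps
```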

## 2.6.2 Digital video

HDTV formats: the resolution of the newer
16/9 wide-screen tubes can be up to
1920 × 1152 pixels
The source intermediate format (SIF) gives a
picture quality comparable with video
cassette recorders (VCRs)

## 2.6.2 Digital video

The common intermediate format (CIF) for use
in videoconferencing applications
Fig 2.23
The quarter CIF (QCIF) for use in video
telephony applications

Fig 2.24
Table 2.2

## Figure 2.23 Sample

positions for SIF and CIF.

## Figure 2.24 Sample positions for

QCIF.

2.6.3 PC video

2.5 Audio
The bandwidth of a typical speech signal
is from 50Hz through to 10kHz; music
signal from 15 Hz through to 20kHz
The sampling rate: 20 ksps (2 × 10 kHz) for
speech and 40 ksps (2 × 20 kHz) for music
Stereophonic (stereo) music results in a bit
rate double that of a monaural (mono)
signal
Example 2.4

## 2.5.2 CD-quality audio

Bit rate per channel
= sampling rate × bits per sample
= $44.1 \times 10^3 \times 16 = 705.6$ kbps
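
The CD-quality figure can be reproduced in a couple of lines. The sketch assumes a sampling rate of 44.1 ksps and 16 bits per sample, and doubles the per-channel rate for stereo; the function name is hypothetical.

```python
def audio_bit_rate_bps(sample_rate_sps: int, bits_per_sample: int,
                       channels: int = 1) -> int:
    """Bit rate = sampling rate x bits per sample x number of channels."""
    return sample_rate_sps * bits_per_sample * channels


print(audio_bit_rate_bps(44_100, 16))              # 705600 bps per channel
print(audio_bit_rate_bps(44_100, 16, channels=2))  # 1411200 bps for stereo
```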

Example 2.5

AUDIO

TWO TYPES OF AUDIO SIGNALS: SPEECH SIGNALS AND MUSIC-QUALITY AUDIO
AUDIO IS PRODUCED BY A MICROPHONE OR A SYNTHESIZER
A SYNTHESIZER PRODUCES AUDIO IN DIGITAL FORMAT, WHICH CAN
BE STORED IN A COMPUTER
PCM SPEECH:
It is a digitization process
Defined in ITU-T Recommendation G.711
PCM CONSISTS OF AN ENCODER AND A DECODER

IT CONSISTS OF A COMPRESSOR AND AN EXPANDER
WITH LINEAR QUANTIZATION, AS USED EARLIER, THE
NOISE LEVEL IS THE SAME FOR BOTH LOUD AND QUIET SIGNALS.
AS THE EAR IS MORE SENSITIVE TO NOISE ON QUIET SIGNALS THAN ON LOUD
SIGNALS, THE PCM SYSTEM USES NON-LINEAR QUANTIZATION
WITH NARROWER INTERVALS FOR QUIET SIGNALS, PRODUCED BY THE COMPRESSOR
AT THE DESTINATION AN EXPANDER IS USED
THE OVERALL OPERATION IS KNOWN AS COMPANDING
AT THE SENDER, THE SIGNAL IS PASSED THROUGH THE
COMPRESSOR FIRST AND THEN PASSED TO THE ADC AND QUANTIZED.
AT THE RECEIVER, EACH CODEWORD IS FIRST PASSED TO THE DAC AND THEN TO THE
EXPANDER
TWO COMPRESSOR CHARACTERISTICS ARE IN USE: A-LAW AND MU-LAW
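
Companding can be sketched with the continuous mu-law characteristic. This is an illustration only: it assumes mu = 255 and amplitudes normalized to the range -1 to +1, and it uses the smooth logarithmic curve in place of the segmented approximation that practical codecs implement. The function names are hypothetical.

```python
import math

MU = 255.0  # assumed compression parameter


def compress(x: float) -> float:
    """Compressor: expands quiet amplitudes so they span more quantization levels."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Expander: inverse of compress(), applied at the destination."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)


for x in (0.01, 0.1, 0.5, 1.0):
    y = compress(x)
    print(f"input {x:5.2f} -> compressed {y:.3f} -> expanded {expand(y):.3f}")
```

An input at 1% of full scale is mapped to more than 20% of the compressed range, so a uniform quantizer placed after the compressor devotes many more levels to quiet signals than to loud ones.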

## CD- QUALITY AUDIO

THE STANDARD FOR CD PLAYERS AND CD-ROMS IS THE CD-DA STANDARD
SYNTHESIZED AUDIO:
Synthesized audio uses less memory
It is easier to edit synthesized audio
Several passages can be mixed together
The three components are the computer, the keyboard and the sound generators
The keyboard sends commands to the computer, which passes them to the sound
generators; these produce the
sound waveform, via a DAC, to drive the speakers
With a synthesizer keyboard, pressing each key generates a different codeword, known as a message,
which is read by the computer program
The control panel has switches and sliders which indicate the volume
and sound effects to the program

The secondary storage interface stores audio on secondary
storage devices
There are programs that allow the user to edit a
previously entered passage or mix several stored
passages together
There is a range of other inputs from instruments
To discriminate between inputs from different possible
sources, a standard set of messages is defined for the
corresponding sound generators
These are defined in a standard called the Musical Instrument
Digital Interface (MIDI)
It defines the format of the standardized set of messages used
by a synthesizer, and the types of connectors, cables and
electrical signals.
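
As an illustration of such a message, the sketch below builds the three bytes of a MIDI note-on and note-off. The byte layout (a status byte carrying the message type and channel, followed by a key number and a velocity, each 0 to 127) comes from the MIDI 1.0 specification, not from these slides; the function names are hypothetical.

```python
def _check(channel: int, note: int, velocity: int) -> None:
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= note <= 127 or not 0 <= velocity <= 127:
        raise ValueError("note and velocity must be 0..127")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Message sent when a key is pressed (status byte 0x90 + channel)."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Message sent when a key is released (status byte 0x80 + channel)."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


# Key number 60 (middle C) pressed with a medium velocity on the first channel.
print(note_on(0, 60, 64).hex())  # 903c40
print(note_off(0, 60).hex())     # 803c00
```

Three bytes per key event is the reason synthesized audio needs far less memory than the sampled waveform of the same passage.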
