
Chapter 2

Multimedia Information Representation


2.1 Introduction
2.2 Digitization principles
2.3 Text
2.4 Images
2.5 Audio
2.6 Video

Introduction
The conversion of an analog signal into a
digital form
Signal encoder, sampling, signal decoder

2.2 Digitization principles


2.2.1 Analog signals
Fourier analysis can be used to show that any
time-varying analog signal is made up of a
possibly infinite number of single-frequency
sinusoidal signals whose amplitude and phase
vary continuously with time relative to each other
Signal bandwidth
Fig2.1
The bandwidth of the transmission channel
should be equal to or greater than the bandwidth
of the signal; if it is less, the channel is known
as a bandlimiting channel

Multimedia Information Representation


Multimedia Information is stored and processed
within a computer in a digital form.
Codeword: Combination of a fixed number of bits that
represents each character, in the case of textual
information.
Analog signal: Signal whose amplitude (magnitude of
the sound/image intensity) varies continuously with
time.
Signal encoder: Electrical circuit used for the
conversion of an analog signal into a digital form.
Signal decoder: Electrical circuit that converts stored
digitized samples into time-varying analogue form.

Analog Signals
As mentioned earlier the amplitude of the signal
varies continuously with time
Fourier analysis can be used to show that any
time-varying signal is made up of a possibly infinite
number of single-frequency sinusoidal components
The range of frequencies of the sinusoidal
components that make up the signal is called the
signal bandwidth
Speech bandwidth: 50 Hz to 10 kHz
Music bandwidth: 15 Hz to 20 kHz

Analogue Signals Signal Properties

To transmit an analogue signal through a network,
the bandwidth of the transmission channel should be
equal to or greater than the signal bandwidth
If the bandwidth of the channel is less than the
signal bandwidth, then the channel is called a
bandlimiting channel

Encoder Design

2.2.2 Encoder design


The encoder consists of a bandlimiting filter and an
analog-to-digital converter (ADC), the latter
comprising a sample-and-hold circuit and a quantizer
Fig2.2
The bandlimiting filter removes selected
higher-frequency components from the source signal (A)
The filtered signal (B) is then fed to the
sample-and-hold circuit
This samples the amplitude of the filtered signal at
regular time intervals (C) and holds the sample
amplitude constant between samples (D)

2.2.2 Encoder design


The quantizer circuit converts each
sample amplitude into a binary value
known as a codeword (E)
The signal must be sampled at a rate which
is higher than the maximum rate of
change of the signal amplitude
The number of different quantization
levels used should be as large as possible

2.2.2 Encoder design


Nyquist sampling theorem states that:
in order to obtain an accurate
representation of a time-varying analog
signal, its amplitude must be sampled
at a minimum rate that is equal to or
greater than twice the highest
sinusoidal frequency component that is
present in the signal.

Encoder Design

Bandlimiting filter: Removes the selected higher
frequency components from the source signal
Sample-and-hold circuit: Samples the amplitude of the
filtered signal at regular intervals and holds the
sampled amplitude constant between samples
Quantizer: Converts the samples into their
corresponding binary form

Encoder Design Data representation


The most significant bit of the codeword
represents the sign of the sample
A binary 0 indicates a positive value and a
binary 1 indicates a negative value
The signal must be sampled at a much higher
rate than the maximum rate of change of the
signal amplitude
The number of quantization levels should be as
large as possible to represent the signal accurately

Sampling Rate
Nyquist sampling theorem: To obtain an accurate
representation of a time-varying analogue signal, its
amplitude must be sampled at a minimum rate that is
equal to or greater than twice the highest sinusoidal
frequency component that is present in the signal
The Nyquist rate is expressed either in Hz or, more
correctly, in samples per second (sps)
Antialiasing filter: Another name for the bandlimiting
filter, since it passes only those frequency components
that the chosen sampling rate can represent without
generating alias signals

Alias signal generation due to undersampling

In reality the transmission channel used often has a
lower bandwidth than the source signal
To avoid distortion the source signal is first passed
through the bandlimiting filter (BLF), which is designed
to pass only the frequency components that are within
the channel bandwidth
This avoids alias signals caused by undersampling
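The effect of undersampling can be illustrated with a short Python sketch. The function names `nyquist_rate` and `alias_frequency` are hypothetical, and the 8 ksps sampling rate is an assumed figure chosen only for the illustration; the folding rule used is the standard one for a single sinusoid.

```python
def nyquist_rate(f_max_hz):
    """Minimum sampling rate (samples per second) for a signal whose
    highest sinusoidal component is f_max_hz."""
    return 2.0 * f_max_hz


def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a single sinusoid after sampling.

    Frequencies above half the sampling rate fold back into the
    range 0 .. f_sample_hz / 2 and appear as alias signals.
    """
    f = f_signal_hz % f_sample_hz
    return min(f, f_sample_hz - f)


# Speech bandlimited to 10 kHz needs at least 20 ksps.
print(nyquist_rate(10_000))           # 20000.0

# Assumed example: sampling at 8 ksps.
print(alias_frequency(3_000, 8_000))  # 3000 (below 4 kHz, unchanged)
print(alias_frequency(6_000, 8_000))  # 2000 (above 4 kHz, alias signal)
```

A bandlimiting filter placed before the sampler would remove the 6 kHz component in this example, so no alias would be generated.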

Quantization Intervals
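Representing the analogue samples exactly would
require an infinite number of binary digits, so each
sample is instead mapped onto one of a finite number
of quantization intervals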

Quantization Intervals
Three bits are used to represent each sample (1 bit for the
sign and two bits to represent the magnitude)
If Vmax is the maximum positive and negative signal
amplitude and n is the number of binary bits used, then the
quantization interval, q, is defined as
$q = 2V_{max} / 2^{n}$
A signal anywhere within a quantization interval will be
represented by the same binary codeword
Each codeword corresponds to the level at the centre of the
corresponding quantization interval
Therefore the represented level can differ from the actual
signal level by up to plus or minus q/2
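The quantization interval and the sign-plus-magnitude codeword can be sketched in Python as follows. This is a minimal illustration, not the circuit described in the slides: the function names are hypothetical, n is assumed to be at least 2, and the peak amplitude of 1.0 is an arbitrary choice.

```python
def quantization_interval(v_max, n_bits):
    """q = 2 * Vmax / 2**n."""
    return 2.0 * v_max / (2 ** n_bits)


def quantize(sample, v_max, n_bits):
    """Return (codeword, nominal_level) for one sample.

    The codeword is a sign bit (0 positive, 1 negative) followed by
    n_bits - 1 magnitude bits. The nominal level is the centre of the
    quantization interval that contains the sample.
    """
    q = quantization_interval(v_max, n_bits)
    magnitude_levels = 2 ** (n_bits - 1)
    # Clamp so that a sample equal to Vmax falls in the top interval.
    index = min(int(abs(sample) / q), magnitude_levels - 1)
    nominal = (index + 0.5) * q
    sign_bit = "0"
    if sample < 0:
        sign_bit = "1"
        nominal = -nominal
    return sign_bit + format(index, "0{}b".format(n_bits - 1)), nominal


q = quantization_interval(1.0, 3)
codeword, nominal = quantize(0.6, 1.0, 3)
print(q)                    # 0.25
print(codeword, nominal)    # 010 0.625
print(abs(0.6 - nominal) <= q / 2)   # True: error is at most q/2
```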

Quantization noise polarity

Quantization error is the difference between the
actual signal amplitude and the corresponding
nominal amplitude (also known as quantization
noise, since its values vary randomly)

Dynamic Range

With high-fidelity music it is important to be able
to hear very quiet passages without any distortion
created by quantization noise
Dynamic range is defined as the ratio of the
maximum signal amplitude to the minimum signal amplitude
$D = 20 \log_{10}(V_{max} / V_{min})$ dB
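As a quick numerical check of the dynamic range formula, the following Python sketch evaluates D for an assumed pair of amplitudes. The function name and the example values (1 V and 1 mV) are illustrative assumptions, not figures from the slides.

```python
import math


def dynamic_range_db(v_max, v_min):
    """D = 20 * log10(Vmax / Vmin), in decibels."""
    return 20.0 * math.log10(v_max / v_min)


# Assumed example: peak amplitude 1 V, smallest amplitude 1 mV.
print(round(dynamic_range_db(1.0, 0.001), 1))   # 60.0
```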

Decoder Design

Encoder + decoder = codec

A signal decoder is an electronic circuit that
converts the stored digitized samples back again into
their analogue form prior to their output, using a
digital-to-analogue converter (DAC) and a low-pass
filter
Low-pass filter: Only passes those frequency
components that were passed by the bandlimiting
filter in the encoder
Text
Unformatted text: Known as plain text; enables pages
to be created which comprise strings of fixed-sized
characters from a limited character set
Formatted text: Known as rich text; enables pages to
be created which comprise strings of characters of
different styles, sizes and shapes, with tables, graphics
and images inserted at appropriate points
Hypertext: Enables an integrated set of documents
(each comprising formatted text) to be created which
have defined linkages between them

Unformatted Text The basic ASCII character set


Control characters
(Back space, escape,
delete, form feed etc)
Printable characters
(alphabetic, numeric,
and punctuation)
The American Standard Code for Information
Interchange is one of the most widely used character sets
and the table includes the binary codewords used to
represent each character (7 bit binary code)
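The 7-bit codewords can be inspected with a few lines of Python. This is only an illustrative sketch; the helper names are hypothetical, and the control-character test uses the usual ASCII convention (codes 0 to 31 and 127).

```python
def ascii_codeword(ch):
    """Return the 7-bit ASCII codeword of a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a character in the basic ASCII set")
    return format(code, "07b")


def is_control(ch):
    """True for control characters such as backspace, escape, delete."""
    code = ord(ch)
    return code < 32 or code == 127


print(ascii_codeword("A"))    # 1000001
print(ascii_codeword("a"))    # 1100001
print(ascii_codeword("0"))    # 0110000
print(is_control("\x1b"))     # True  (escape)
print(is_control("A"))        # False (printable)
```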

Unformatted Text Supplementary set of Mosaic characters

The characters in columns 010/011 and 110/111 are
replaced with the set of mosaic characters; these are then
used, together with the various uppercase characters
illustrated, to create relatively simple graphical images

Unformatted Text Examples of Videotext/Teletext

Although in practice the total page is made up of a
matrix of symbols and characters which all have the
same size, some simple graphical symbols and text of
larger sizes can be constructed by the use of groups of
the basic symbols

Formatted Text

Formatted text is produced by most word processing
packages and used extensively in the publishing sector
for the preparation of papers, books, magazines, journals
and so on.
Documents of mixed type (characters of different styles,
fonts, shapes etc.) are possible.
Format control characters are used

Hypertext Electronic Document in hypertext

Hypertext can be used to create an electronic version of
documents, with the index, descriptions of departments,
courses on offer, library, and other facilities all written in
hypertext as pages with various defined hyperlinks

An example of a hypertext language is HTML, which is used to
describe how the contents of a document are presented
on a printer or a display; other mark-up languages are
PostScript, SGML (Standard Generalized Markup
Language), TeX and LaTeX.

Images
Images include computer-generated images (referred
to as computer graphics or simply graphics) and
digitized images of both documents and pictures
All types of images are displayed in the form of a
two-dimensional matrix of individual picture elements
(pixels or pels), but are represented differently within
the computer memory (file)
Each type of these images is created differently.

Graphics

VGA is a common type of display that consists
of a matrix of 640 horizontal pixels by 480
vertical pixels with, for example, 8 bits per pixel,
which allows each pixel to have one of 256
different colours

Graphics

Colouring a solid block with the same colour is
known as rendering.
All objects are made up of a series of lines that are
connected to each other; what appears to be a curved
line is in practice a series of short lines, each made up
of a string of pixels
Each object has a number of attributes associated
with it. These include its shape, its size in terms of pixel
positions, the colour of its border etc.

Graphics - Conclusions

There are two forms of representation:
- high-level representation (similar to the source
code of a program), which requires less memory to store the
image and less bandwidth for transmission
- the actual pixel image of the graphic (similar to
low-level machine code and generally known as
bit-map format), e.g. GIF (Graphics Interchange
Format), TIFF (Tagged Image File Format)
A graphic can be transferred over the network in
either form
A software package called SRGP (Simple Raster Graphics
Package) is used to convert the high-level form into a
pixel-image form

Digitized Documents- Fax Principles

The scanner associated with fax machines operates
by scanning each complete page from left to right to
produce a sequence of scan lines that start at the top of
the page and end at the bottom
Vertical resolution is either 3.85 or 7.7 lines per
millimetre (approximately 100 or 200 lines per inch)

Digitized Documents- Digitization format

Fax machines use a single binary digit to represent
each pel, a 0 for a white pel and a 1 for a black pel.
Hence the digital representation of a scanned page
produces a stream of about 2 million bits.
A single binary digit per pel means fax machines are
best suited to bitonal images.
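The figure of about 2 million bits can be reproduced with a rough Python estimate. The page height (an A4 page, 297 mm) and the 1728 pels per scan line are assumptions that do not appear in the slides; the lower vertical resolution of 3.85 lines per millimetre is used.

```python
# Assumed values, not taken from the slides:
PAGE_HEIGHT_MM = 297      # height of an A4 page
PELS_PER_LINE = 1728      # pels per scan line on a Group 3 fax machine
LINES_PER_MM = 3.85       # lower vertical resolution

scan_lines = int(PAGE_HEIGHT_MM * LINES_PER_MM)
bits_per_page = scan_lines * PELS_PER_LINE   # one bit per pel

print(scan_lines)       # 1143
print(bits_per_page)    # 1975104, i.e. about 2 million bits
```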

Colour Derivation Principles additive colour mixing (R + G + B)

Black is produced when all three primary colours
(R, G, B) are zero.
Useful for producing a colour image on a black
surface, as is the case in display applications

Digitised Pictures- Subtractive colour mixing

White is produced when the three chosen
primary colours cyan, magenta and yellow are all
zero.
Useful for producing a colour image on a white
surface, as is the case in printing applications.

Digitized Pictures- Television/computer monitor principles

The picture tubes used in most television sets
operate using what is known as a raster scan;
this involves a finely focused electron beam
being scanned over the complete screen

Digitized Pictures- Raster Scan

Progressive scanning is performed by repeating
the scanning operation that starts at the top left
corner of the screen and ends at the bottom right
corner, followed by the beam being deflected back
again to the top left corner

Digitized Pictures Raster scan display architecture

Digitized Pictures-Pixel format on each scan

The set of three related colour-sensitive
phosphors associated with each pixel is called a
phosphor triad, and the typical arrangement of the
triads on each scan line is shown.

Digitized Pictures Concepts

Frame: Each complete set of horizontal scan lines
(either 525 for North & South America and most of Asia,
or 625 for Europe and other countries)
Flicker: Caused by the previous image fading from the
eye retina before the following image is displayed, which
happens at a low refresh rate (to avoid this a refresh rate
of at least 50 times per second is required)
Pixel depth: Number of bits per pixel, which determines
the range of different colours that can be produced
Colour Look-up Table (CLUT): Table that stores a
selected subset of colours; each pixel value is then used
as an address to a location in the table, reducing the
amount of memory required to store an image
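The memory saving from a colour look-up table can be sketched in Python. The image size (640 x 480), the 8-bit index and the 24-bit colour entries are assumed values for illustration, and the four-entry table is a toy example rather than a real palette.

```python
def direct_bits(width, height, colour_bits=24):
    """Memory (bits) when every pixel stores its full colour value."""
    return width * height * colour_bits


def clut_bits(width, height, index_bits=8, colour_bits=24):
    """Memory (bits) when every pixel stores only a table address,
    plus the table of 2**index_bits full colour values."""
    table_entries = 2 ** index_bits
    return width * height * index_bits + table_entries * colour_bits


# Toy table: the pixel value is the address of an (R, G, B) entry.
clut = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
pixels = [0, 1, 1, 3]
colours = [clut[p] for p in pixels]

print(colours[3])               # (0, 0, 255)
print(direct_bits(640, 480))    # 7372800
print(clut_bits(640, 480))      # 2463744
```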

Digitized Pictures

Aspect Ratio: This is the ratio of the screen
width to the screen height (television tubes and
PC monitors have an aspect ratio of 4/3 and
wide-screen television is 16/9)

Digitized Pictures Screen Resolutions

NTSC = 525 lines per frame (480 visible)
PAL, CCIR, SECAM = 625 lines per frame (576 visible)
Example display resolutions: VGA (640x480x8),
XGA (1024x768x8) and SVGA (1024x768x24)

2.4.3 Digitized pictures


Color principles
A whole spectrum of colors known as a color
gamut can be produced by using different
proportions of red(R), green(G), and blue (B)
Fig 2.12
Additive color mixing producing a color image on a
black surface
Subtractive color mixing for producing a color
image on a white surface
Fig 2.13

2.4.3 Digitized pictures


Raster-scan principles
Progressive scanning
Each complete set of horizontal scan lines is
called a frame
The number of bits per pixel is known as
the pixel depth and determines the range
of different colors.

2.4.3 Digitized pictures


Aspect ratio
Both the number of pixels per scanned line and
the number of lines per frame depend on the aspect ratio
The aspect ratio is the ratio of the screen width to the
screen height
National Television Standards Committee (NTSC),
PAL (UK), CCIR (Germany), SECAM (France)
Table 2.1

2.4.3 Digitized pictures

Example 2.3
Derive the time to transmit the following digitized images at both 64 kbps and
1.5 Mbps networks
a $640 \times 480 \times 8$ VGA-compatible image
a $1024 \times 768 \times 24$ SVGA-compatible image

Solution
The size of each image in bits is as follows
a VGA image = $640 \times 480 \times 8$ = 2.46 Mbits
an SVGA image = $1024 \times 768 \times 24$ = 18.87 Mbits
The time to transmit each image is given as follows
at 64 kbps: VGA = $[2.46 \times 10^{6}]/[64 \times 10^{3}]$ = 38.4 sec.
SVGA = $[18.87 \times 10^{6}]/[64 \times 10^{3}]$ = 295 sec.
at 1.5 Mbps: VGA = $[2.46 \times 10^{6}]/[1.5 \times 10^{6}]$ = 1.64 sec.
SVGA = $[18.87 \times 10^{6}]/[1.5 \times 10^{6}]$ = 12.58 sec.
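The arithmetic of Example 2.3 can be checked with a short Python sketch. The function names are hypothetical; kbps and Mbps are taken to mean 10^3 and 10^6 bits per second, as in the worked solution.

```python
def image_bits(width, height, bits_per_pixel):
    """Size of an uncompressed digitized image in bits."""
    return width * height * bits_per_pixel


def transmit_time_s(bits, bit_rate_bps):
    """Time in seconds to send the image over a link of the given rate."""
    return bits / bit_rate_bps


vga = image_bits(640, 480, 8)       # 2457600 bits
svga = image_bits(1024, 768, 24)    # 18874368 bits

for name, bits in (("VGA", vga), ("SVGA", svga)):
    for rate in (64e3, 1.5e6):
        print(name, rate, "{:.2f} s".format(transmit_time_s(bits, rate)))
```

With these inputs the four times come out at roughly 38.4 s and 1.64 s for the VGA image, and 295 s and 12.58 s for the SVGA image.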

2.4.3 Digitized pictures


Digital cameras and scanners
An image is captured within the
camera/scanner using an image sensor
A two-dimensional grid of light-sensitive
cells called photosites
A widely-used image sensor is a charge-coupled device (CCD)
Fig 2.16

Digitized Pictures Colour Image Capture: Schematic

Typical arrangement that is used to capture
and store a digital image produced by a scanner
or a digital camera (either a still camera or a
video camera)
Photosites: The image sensor is a silicon chip which
consists of a two-dimensional grid of light-sensitive
cells called photosites, each of which stores
the level of intensity of the light that falls on it
Charge-coupled device (CCD): Image sensor that
converts the level of light intensity on each photosite
into an equivalent electrical charge

2.6 Video
2.6.1 Broadcast television
Scanning sequence
It is necessary to use a minimum refresh
rate of 50 times per second to avoid flicker
A frame refresh rate of 25 times per second is
sufficient to portray smooth motion
Field: each frame is therefore transmitted in two
halves known as fields, the first comprising only the
odd scan lines and the second the even scan lines

2.6.1 Broadcast television


The two fields are then integrated together in
the television receiver using a technique
known as interlaced scanning
Fig 2.19
The three main properties of a color source
Brightness
Hue:this represents the actual color of the
source
Saturation:this represents the strength or
vividness of the color

2.6.1 Broadcast television


The term luminance is used to refer to the
brightness of a source
The hue and saturation are referred to as its
chrominance
$Y_s = 0.299 R_s + 0.587 G_s + 0.114 B_s$

where $Y_s$ is the amplitude of the luminance
signal and $R_s$, $G_s$ and $B_s$ are the magnitudes
of the three color component signals

2.6.1 Broadcast television


The blue chrominance (Cb), and the red
chrominance (Cr) are then used to
represent hue and saturation
The two color difference signals:

$C_b = B_s - Y_s$

$C_r = R_s - Y_s$

2.6.1 Broadcast television


In the PAL system, Cb and Cr are referred to as
U and V respectively
PAL: $Y = 0.299R + 0.587G + 0.114B$
$U = 0.493(B - Y)$
$V = 0.877(R - Y)$

The NTSC system forms two different signals
referred to as I and Q
NTSC: $Y = 0.299R + 0.587G + 0.114B$
$I = 0.74(R - Y) - 0.27(B - Y)$
$Q = 0.48(R - Y) + 0.41(B - Y)$
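The PAL and NTSC expressions above can be tried out in a few lines of Python. This is a sketch under stated assumptions: R, G and B are taken as values between 0 and 1, the function names are hypothetical, and the signs in the I and Q expressions follow the usual NTSC definition (I subtracts the blue difference term, Q adds it).

```python
def luminance(r, g, b):
    """Y = 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def rgb_to_yuv(r, g, b):
    """PAL: scaled colour difference signals U and V."""
    y = luminance(r, g, b)
    return y, 0.493 * (b - y), 0.877 * (r - y)


def rgb_to_yiq(r, g, b):
    """NTSC: I and Q formed from the same two colour differences."""
    y = luminance(r, g, b)
    i = 0.74 * (r - y) - 0.27 * (b - y)
    q = 0.48 * (r - y) + 0.41 * (b - y)
    return y, i, q


# White has full luminance and (almost exactly) zero chrominance.
print(rgb_to_yuv(1.0, 1.0, 1.0))
# Pure red: luminance 0.299, negative U, positive V.
print(rgb_to_yuv(1.0, 0.0, 0.0))
print(rgb_to_yiq(1.0, 0.0, 0.0))
```

Because the three luminance weights sum to 1, any grey level (R = G = B) gives zero colour difference signals, which is why a monochrome receiver can use Y alone.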

2.6.2 Digital video


Experiments have shown that the resolution of the eye is
less sensitive for color than it is for luminance
4:2:2 format
The original digitization format used in
Recommendation CCIR-601
A line sampling rate of 13.5MHz for luminance
and 6.75MHz for the two chrominance signals
The number of samples per line is increased to
720

2.6.2 Digital video


The corresponding number of samples for each
of the two chrominance signals is 360 samples
per active line
This results in 4Y samples for every 2Cb, and 2Cr
samples
The numbers 480 and 576 being the number of
active (visible) lines in the respective system
Fig. 2.21
Example 2.7

Figure 2.21 Sample positions with 4:2:2 digitization format.

2.6.2 Digital video


The 4:2:0 format is used in digital video
broadcast applications
Interlaced scanning is used, with chrominance
samples absent in alternate lines
The same luminance resolution but half the
chrominance resolution
Fig2.22

Figure 2.22 Sample positions in 4:2:0 digitization format.

2.6.2 Digital video


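525-line system: $Y = 720 \times 480$
$C_b = C_r = 360 \times 240$

625-line system: $Y = 720 \times 576$
$C_b = C_r = 360 \times 288$
Bit rate: $13.5 \times 10^{6} \times 8 + 2(3.375 \times 10^{6} \times 8) = 162$ Mbps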
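The bit rates of the two digitization formats can be compared with a small Python sketch. It assumes 8 bits per sample for every component, the 13.5 MHz and 6.75 MHz sampling rates quoted for 4:2:2, and an effective chrominance rate of 3.375 MHz for 4:2:0; the function name is hypothetical.

```python
def video_bit_rate(lum_rate_hz, chroma_rate_hz, bits_per_sample=8):
    """Bit rate (bps) of one luminance and two chrominance signals."""
    return (lum_rate_hz * bits_per_sample
            + 2 * chroma_rate_hz * bits_per_sample)


rate_422 = video_bit_rate(13.5e6, 6.75e6)
rate_420 = video_bit_rate(13.5e6, 3.375e6)

print(rate_422 / 1e6)   # 216.0 (Mbps, 4:2:2)
print(rate_420 / 1e6)   # 162.0 (Mbps, 4:2:0)
```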

2.6.2 Digital video


HDTV formats: the resolution of the newer
16/9 wide-screen tubes can be up to
1920 x 1152 pixels
The source intermediate format (SIF) gives a
picture quality comparable with video
cassette recorders (VCRs)

2.6.2 Digital video


The common intermediate format (CIF) for use
in videoconferencing applications
Fig 2.23
The quarter CIF (QCIF) for use in video
telephony applications

Fig 2.24
Table 2.2

Figure 2.23 Sample positions for SIF and CIF.

Figure 2.24 Sample positions for QCIF.

2.6.3 PC video

2.5 Audio
The bandwidth of a typical speech signal
is from 50Hz through to 10kHz; music
signal from 15 Hz through to 20kHz
The sampling rate: 20ksps (2*10kHz) for
speech and 40ksps (2*20kHz) for music
Music stereophonic (stereo) results in a bit
rate double that of a monaural(mono)
signal
Example 2.4

2.5.2 CD-quality audio


Bit rate per channel
= sampling rate × bits per sample
$= 44.1 \times 10^{3} \times 16 = 705.6$ kbps

Total bit rate = 2 × 705.6 kbps = 1.411 Mbps
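The CD-quality bit rate calculation can be reproduced in Python. The function name is hypothetical; the sampling rate (44.1 ksps), the 16 bits per sample and the two stereo channels are the figures used in the calculation above.

```python
def pcm_bit_rate(sampling_rate_sps, bits_per_sample, channels=1):
    """Bit rate (bps) of uncompressed PCM audio."""
    return sampling_rate_sps * bits_per_sample * channels


per_channel = pcm_bit_rate(44_100, 16)
stereo = pcm_bit_rate(44_100, 16, channels=2)

print(per_channel)   # 705600  (705.6 kbps)
print(stereo)        # 1411200 (about 1.411 Mbps)
```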


Example 2.5

AUDIO

Two types of audio signal: speech signals and
music-quality audio
Audio is produced by a microphone or a synthesizer
A synthesizer produces audio in a digital format which
can be stored directly in the computer
PCM speech:
Pulse code modulation (PCM) is a digitization process
Defined in ITU-T Recommendation G.711
A PCM system consists of an encoder and a decoder
It also includes a compressor and an expander
With the linear quantization described earlier, the
noise level is the same for both loud and quiet signals
As the ear is more sensitive to noise on quiet signals
than on loud signals, a PCM system uses non-linear
quantization, with narrower intervals for
small-amplitude signals, implemented by the compressor
At the destination an expander performs the reverse
operation
The overall operation is known as companding
The input signal is passed through the compressor
first and then to the ADC, where it is sampled and
quantized
At the receiver, each codeword is first passed to the
DAC and then to the expander
There are two compressor characteristics: A-law and
mu-law
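The compressor and expander can be illustrated with the continuous mu-law curve. This Python sketch is a simplified model, not the piecewise characteristic specified in G.711: it assumes mu = 255 and input amplitudes normalized to the range -1 to +1, and the function names are hypothetical.

```python
import math

MU = 255.0   # assumed value of the mu-law parameter


def compress(x):
    """Compressor: boosts small amplitudes before uniform quantization."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Expander: the inverse of compress(), used at the receiver."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)


quiet, loud = 0.01, 0.5
print(compress(quiet))           # much larger than 0.01
print(compress(loud))            # only moderately larger than 0.5
print(expand(compress(quiet)))   # approximately 0.01 again
```

Because quiet signals occupy a larger share of the compressed range, a uniform quantizer applied after the compressor gives them effectively narrower quantization intervals than loud signals.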

CD-QUALITY AUDIO

The standard for CD players and CD-ROMs is the
CD-digital audio (CD-DA) standard
SYNTHESIZED AUDIO:
Synthesized audio uses less memory
It is easier to edit synthesized audio and to
mix several passages together
The three components are the computer, the keyboard and
the sound generators
The keyboard sends commands to the computer, which passes
them to the sound generators; these produce
the sound waveform, via DACs, to drive the speakers
For each key a different codeword, known as a message, is
generated by the synthesizer keyboard and read by the
computer program
The control panel has switches and sliders which set the
volume and sound effects for the program

A secondary storage interface allows the audio to be
stored on secondary storage devices
There are programs to allow the user to edit a
previously entered passage or mix several stored
passages together
There is a range of other inputs from instruments
To discriminate between inputs from different possible
sources, a standard set of messages is defined for
the inputs and for the corresponding sound generators
These are defined in a standard known as the Musical
Instrument Digital Interface (MIDI)
It defines the format of the standardized set of messages
used by a synthesizer, and also the types of connectors,
cables and electrical signals.
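As an illustration of what one such message looks like, the Python sketch below builds a MIDI "note on" message. The byte layout (a status byte of 0x90 plus the channel number, followed by a note number and a velocity, each 0 to 127) comes from the MIDI 1.0 specification rather than from these slides, and the function name is hypothetical.

```python
def note_on(channel, note, velocity):
    """Build the three bytes of a MIDI note-on message.

    channel: 0..15, note: 0..127 (60 is middle C), velocity: 0..127.
    """
    if not (0 <= channel <= 15 and 0 <= note <= 127
            and 0 <= velocity <= 127):
        raise ValueError("value outside the MIDI range")
    return bytes([0x90 | channel, note, velocity])


message = note_on(0, 60, 64)
print(message.hex())   # 903c40
print(len(message))    # 3
```

Three bytes per key press is the reason synthesized audio needs far less memory than the sampled waveform it produces.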
