Multimedia - ECC
Images are a media type that presents visual information. They can be drawings,
paintings, or photographs, and they are used to create interest and provide information.
Photographs and other types of graphical data are designed specifically for display. An
image on a screen is made up of dots called pixels. A pixel is the smallest part of the
screen that can be controlled by the computer or other device. The total number of
pixels on a screen is called its resolution (e.g., the new iPad's Retina display has a
2048 x 1536 resolution). An image can be represented in two ways: as a bitmap or as a
vector. Typical bitmap file formats include JPEG, GIF, PNG and BMP; typical vector
formats include SVG, WMF and EMF.
An image consists of a rectangular array of dots called pixels. The size of the image is
specified as width x height, in pixels. The physical size of the image, in inches or
centimeters, depends on the resolution of the device on which the image is displayed;
resolution is usually measured in DPI (dots per inch). The same image will therefore
appear smaller on a device with a higher resolution than on one with a lower resolution.
For color images, one needs enough bits per pixel to represent all the colors in the
image. The number of bits per pixel is called the depth of the image.
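These relationships can be sketched in a few lines. The 2048 x 1536 resolution is the Retina figure mentioned above; the 24-bit depth and the 264 DPI value are illustrative assumptions, not facts from the text:

```python
# Sketch: relating pixel dimensions, bit depth, and physical size.
# 24-bit depth and 264 DPI below are assumed example values.

def raw_image_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed bitmap size in bytes."""
    return width_px * height_px * bits_per_pixel // 8

def physical_size_inches(width_px, height_px, dpi):
    """Physical width/height when shown on a device with the given DPI."""
    return width_px / dpi, height_px / dpi

# 2048 x 1536 at 24 bits per pixel (8 bits each for R, G, B):
size = raw_image_bytes(2048, 1536, 24)           # 9437184 bytes (~9 MB)
w_in, h_in = physical_size_inches(2048, 1536, 264)
print(size, round(w_in, 2), round(h_in, 2))
```

Note how the uncompressed size (about 9 MB here) motivates the compressed formats (JPEG, PNG, etc.) listed above.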
Que 1(b) What is Sound?
Sound is a form of energy caused by the vibration of objects. Vibration is the rapid
to-and-fro motion of an object or particle. We sense these vibrations in our ears as
sound.
Characteristics of Sound:
Sound has specific characteristics that define the way we hear it. Some of the major
characteristics of sound are as follows:
Wavelength
It is the distance between two adjacent areas of compression or rarefaction. In
transverse waves, it is the distance between two adjacent peaks or two adjacent troughs.
Pitch
Pitch is the characteristic of sound by which a shrill note can be distinguished from a
grave or flat one; it is why we can tell a female voice from a male voice without seeing
the speaker. The term 'pitch' is often used in music. Pitch depends on the frequency of
the sound wave: a note of high frequency has a high pitch and a note of low frequency
has a low pitch. For example, a small child's voice has a higher frequency, so its pitch
is higher than the pitch of a grown man. A sound with a high frequency is called shrill.
Loudness
Loudness is the sensation of how strong a sound wave is at a place. It is always a
relative term and a dimensionless quantity, measured in decibels (dB). It is given as:
L = 10 log10(I / I0), where 'I' is the intensity of the sound and 'I0' is a reference
intensity (conventionally the threshold of hearing, 10^-12 W/m^2).
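The decibel relation can be checked numerically. The sketch below uses the conventional reference intensity I0 = 10^-12 W/m² (the threshold of hearing), which is an assumption not stated explicitly in the text:

```python
import math

def loudness_db(intensity, i_ref=1e-12):
    """Sound level in decibels relative to a reference intensity
    (default: 1e-12 W/m^2, the conventional threshold of hearing)."""
    return 10 * math.log10(intensity / i_ref)

print(loudness_db(1e-12))  # ~0 dB  (threshold of hearing)
print(loudness_db(1e-2))   # ~100 dB (roughly a jackhammer)
```

Because the scale is logarithmic, every tenfold increase in intensity adds 10 dB.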
The loudness depends on the amplitude of the vibration: the higher the amplitude, the
louder the sound. When we pluck a string of a sitar gently, it starts vibrating with low
amplitude; if we apply more energy by plucking it more strongly, the string vibrates with
greater amplitude and produces a louder sound. As the amplitude of vibration increases,
the loudness also increases.
Loudness is a measure of the intensity or strength of the sound waves; in waves, it
corresponds to amplitude. Amplitude is the amount of displacement of the particles as
they vibrate in the medium: the greater the displacement, the greater the amplitude, and
the louder the sound.
Pluck the string of a guitar lightly. It vibrates with less amplitude giving out a low sound.
Pluck it with more force. The sound is louder due to its higher amplitude.
Quality
Quality is also described by the word timbre. Since different sources produce
different sounds, timbre helps us to distinguish between them. A sound of good
quality is pleasant to listen to. Instruments of different shapes and sizes produce
harmonics of different loudness, so their sounds can easily be distinguished.
Timbre
In a musical concert, we can clearly hear the different sounds coming from the various
musical instruments. For example, we can distinguish between the piano, drum, sitar,
clarinet or flute. How is it possible?
It is possible because of a property of sound called timbre. In simple words, timbre
means the quality of the sound. Just as each part of a picture has its own color that our
eyes can distinguish, each sound has its own timbre; timbre is therefore equivalent to
color and is also called tone color.
Two sounds from different sources can have the same pitch (frequency) and loudness
(amplitude). But we can distinguish between them because of the difference in their
timbres. This is because each sound has its own waveform or the shape of its wave. A
waveform is formed by mixing up waves of different frequencies. So every object will
create a sound with its own waveform. For example, the waveform of a flute is different
from that of a veena.
Music has an organized structure in its waveform. This makes it pleasant to listen to.
We can define noise as a random set of waves that causes unpleasantness when we
hear it. Multiple waves of different frequencies and amplitudes get mixed up, giving a
jarring effect when we listen to it. Sounds of machines, traffic, and crowded places are
examples of sources of noise. When this disorganized sound becomes too loud, it is
noise pollution and may lead to health disorders.
Types of SCSI
1. SCSI-1: SCSI-1 features an 8-bit data bus, allowing for the parallel transfer of 8
bits of data simultaneously. It supports a data transfer rate of up to 5 MB/s.
2. SCSI-2: SCSI-2 maintains an 8-bit data bus similar to SCSI-1 and offers backward
compatibility. It supports data transfer rates of up to 10 MB/s.
3. Fast SCSI: Fast SCSI, also known as SCSI-2 Fast, operates with an 8-bit data bus
and provides increased data transfer rates ranging from 10 MB/s to 20 MB/s.
4. Wide SCSI: Wide SCSI expands the data bus width to 16 bits, effectively doubling
the data transfer rate compared to Fast SCSI. It supports data rates of up to 20
MB/s or 40 MB/s.
5. Ultra SCSI: Ultra SCSI retains the 8-bit data bus but doubles the bus clock relative
to Fast SCSI, offering data transfer rates of up to 20 MB/s. (Low-voltage differential
signaling arrived later, with Ultra2 SCSI.)
6. Ultra Wide SCSI: Ultra Wide SCSI combines the wider 16-bit data bus of Wide
SCSI with the higher data transfer rates of Ultra SCSI, resulting in speeds of up to
40 MB/s.
7. Ultra2 SCSI: Ultra2 SCSI supports an 8-bit data bus and employs LVD/SE (Low
Voltage Differential/Single Ended) signaling. It achieves data transfer rates of 40
MB/s to 80 MB/s.
8. Ultra3 SCSI: Ultra3 SCSI, also known as Ultra160 SCSI, uses a 16-bit data bus with
double-transition clocking and offers data transfer rates of up to 160 MB/s.
9. Ultra320 SCSI: Ultra320 SCSI keeps the 16-bit data bus and delivers data transfer
rates of up to 320 MB/s.
10. Serial Attached SCSI (SAS): SAS is a newer SCSI standard that uses a serial data
transfer method. It offers higher data rates and improved scalability compared to
parallel SCSI. The data bus width for SAS can vary, with 1, 2, 4, or 8 lanes
available.
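As a rough sketch, the peak rate of a parallel SCSI bus is just bus width times clock rate (times transfers per cycle for double-transition variants). The clock figures below are the commonly quoted ones and should be treated as assumptions rather than facts from the list above:

```python
# Sketch: peak throughput of a parallel bus in MB/s.
# peak = (bus width in bytes) x (clock in MHz) x (transfers per cycle)

def peak_mb_per_s(bus_bits, clock_mhz, transfers_per_cycle=1):
    return (bus_bits // 8) * clock_mhz * transfers_per_cycle

print(peak_mb_per_s(8, 5))          # SCSI-1:          5 MB/s
print(peak_mb_per_s(8, 10))         # Fast SCSI:      10 MB/s
print(peak_mb_per_s(16, 10))        # Wide SCSI:      20 MB/s
print(peak_mb_per_s(16, 40))        # Ultra2 Wide:    80 MB/s
print(peak_mb_per_s(16, 40, 2))     # Ultra160 (DDR): 160 MB/s
```

This makes it easy to see why "Wide" variants double the rate of their narrow counterparts: only the bus width changes.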
Features of SCSI:
● High-speed data transfer: SCSI interfaces offer faster data transfer rates
compared to older interfaces like parallel ports.
● Wide device support: SCSI supports a wide range of peripheral devices, including
hard drives, optical drives, tape drives, scanners, and more.
● Daisy-chaining: SCSI allows multiple devices to be connected in a chain or bus,
simplifying the connectivity of multiple peripherals.
● Command set: SCSI has a versatile command set that enables advanced
functionality and control over connected devices.
● Flexibility: SCSI interfaces can support a variety of data transfer modes and
configurations, allowing for flexible device connectivity.
Drawbacks of SCSI:
● Cost: SCSI devices and cables can be more expensive compared to other
interface options.
● Cable length and flexibility: SCSI cables have limitations in terms of length and
flexibility, which can be challenging in certain setups.
● Device limitations: SCSI has limitations on the maximum number of devices that
can be connected in a chain.
● Complexity: SCSI can be more complex to set up and configure compared to
other interface standards.
● Compatibility: Compatibility issues may arise when connecting different SCSI
devices or when connecting SCSI devices with newer computer systems.
● Reduced popularity and support: The demand for SCSI has declined with the
emergence of newer interface technologies, leading to reduced availability and
support.
MCI
In the context of multimedia, MCI stands for Media Control Interface. MCI is a
programming interface that provides a standardized way to control and manage
multimedia devices and their associated resources, such as sound cards, CD-ROM
drives, video capture cards, and MIDI devices. It was developed by Microsoft as part of
the Windows operating system.
MCI provides a set of commands and functions that allow applications to perform
various multimedia operations, including playback, recording, seeking, volume control,
and device configuration. By using MCI, multimedia applications can interact with
different multimedia devices in a consistent and platform-independent manner.
MCI Devices
The Media Control Interface defines the following standard device types:
● cdaudio
● Digital video
● overlay
● sequencer
● VCR
● Video disc
● waveaudio
Each of these so-called MCI devices (e.g. a CD-ROM or VCD player) can play a certain type of file;
for example, AVIVideo plays .avi files and CDAudio plays CD-DA tracks, among others. Other MCI
devices have also been made available over time.
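MCI is driven by command strings. The sketch below only builds such strings; on Windows they would be passed to the mciSendString function in winmm.dll. The file name and alias are made-up examples:

```python
# Sketch: constructing MCI command strings. On Windows these strings
# would be handed to mciSendString in winmm.dll; here we only build them.

def mci_open(filename, device_type, alias):
    """e.g. open "song.wav" type waveaudio alias snd"""
    return f'open "{filename}" type {device_type} alias {alias}'

def mci_play(alias, wait=False):
    """Play a previously opened device, optionally blocking until done."""
    return f"play {alias} wait" if wait else f"play {alias}"

def mci_close(alias):
    """Release the device."""
    return f"close {alias}"

print(mci_open("song.wav", "waveaudio", "snd"))
print(mci_play("snd", wait=True))
print(mci_close("snd"))
```

The same open/play/close pattern applies to the other device types in the list (cdaudio, sequencer, and so on), which is what makes MCI a uniform interface.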
The physiology of speech:-
(i) The speech signal is generated when air expelled from the lungs acoustically excites
the vocal cords, producing a sequence of sounds.
(ii) The lungs, along with the diaphragm, are the main source of power for producing
speech signals.
(iii) There are 3 main cavities of vocal tract:
(a) Pharynx (b) Oral cavity (c) Nasal cavity
(iv) While speaking, the air gushes through the V-shaped opening called the glottis and
through the larynx into the vocal tract.
(v) The larynx's basic function is to manipulate the airflow by swiftly opening and
closing the folds of the vocal cords, producing a variety of sounds.
(vi) The vibration frequency is dependent on mass and tension. It varies from person to
person.
(vii) The soft palate connects the nasal cavity to the pharynx and can isolate them from
each other. The lower end of the pharynx contains the epiglottis and the false vocal
cords, which prevent food from entering the larynx: they close during swallowing and
open during respiration.
(viii) The acoustics vary as we move our lips, tongue, palate, teeth and cheeks, and
also depend on their size and shape. The walls of and constrictions in the vocal tract
also shape the generated sound.
(ix) The organs responsible for speech are the lungs, larynx and vocal tract: the lungs
provide the larynx with airflow, and the larynx then modifies it to produce the source
signal for the vocal tract.
(x) Periodic, noisy and impulsive sources are the basic categories of sound, and they
are generally used in combination. For example, the word 'shop' uses all three sources:
|sh| is noisy, |o| periodic and |p| impulsive.
When focusing on multimedia system architecture, the most important consideration is
the multimedia application itself.
• Applications include training, conferencing, messaging, and education apps.
3 b) What do you understand by multimedia file synchronization?
Lecture 11 - Synchronization in Multimedia Systems
Formant synthesis models the vocal tract as a digital filter with resonators and
anti-resonators. These systems use a low-pass filtered periodic pulse train as a source
for voiced signals and a random noise generator as an unvoiced source. A mixture of the
two sources can also be used for speech units that have both voiced and unvoiced
properties. Rules specify the time-varying values of the control parameters for the
filter and excitation; this is probably the most widely used approach in commercial
synthesis systems. Linear predictive synthesis uses linear predictive coding to
represent the output signal. Each speech sample is represented as a linear combination
of the N previous samples plus an additive excitation term. As in formant synthesis, the
excitation term uses a pulse train for a voiced signal and noise for an unvoiced one.
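The linear predictive model above (each sample = weighted sum of the N previous samples + excitation) can be sketched in a few lines. The coefficients and excitation here are toy values, not a trained speech model:

```python
# Sketch of LPC synthesis: s[n] = sum_k a_k * s[n-k] + e[n],
# with toy coefficients and excitation (not real speech data).

def lpc_synthesize(coeffs, excitation):
    """coeffs[k-1] multiplies the sample k steps back; e[n] is the excitation."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out

# A single impulse through a one-tap filter (a1 = 0.5) decays geometrically:
print(lpc_synthesize([0.5], [1, 0, 0, 0]))  # [1, 0.5, 0.25, 0.125]
```

In a real synthesizer, the excitation would be a pulse train (voiced) or noise (unvoiced), and the coefficients would change frame by frame.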
While the source-filter-based systems are capable of producing quite intelligible speech,
they have a distinct mechanical sound, and would not be mistaken for a human. This
quality arises from the simplifying assumptions made by the model and therefore is not
easily remedied.
Speech production model
In order to synthesize speech sounds artificially; we need a model of the speech
production system.
Speech compression, also known as speech coding or voice compression, refers to the
process of reducing the amount of data required to represent speech signals. It involves
encoding speech information in a compressed format so that it can be efficiently
transmitted, stored, or processed while maintaining an acceptable level of speech
quality.
Speech compression is necessary because raw speech signals typically contain a large
amount of redundant and irrelevant information. By applying various compression
techniques, it becomes possible to significantly reduce the data size of speech signals
without significantly degrading their intelligibility and quality.
The compression process involves two main steps: encoding and decoding.
1. Encoding: In this step, the speech signal is analyzed and transformed into a
compressed representation. Various techniques are employed to reduce redundancy
and exploit the characteristics of human speech perception, including predictive coding,
transform coding, quantization, and psychoacoustic modeling. The encoded
representation typically requires fewer bits than the original signal.
2. Decoding: In this step, the compressed representation is expanded back into a
speech signal for playback or further processing.
Low bit rate speech compression specifically focuses on compressing speech signals at
low data rates, typically ranging from a few kilobits per second (kbps) to a few tens of
kilobits per second. This type of compression is often employed in various applications
such as telephony, voice over IP (VoIP), mobile communication, and streaming services
where bandwidth or storage limitations exist.
Low bit rate speech compression algorithms employ various techniques to reduce the
data size of speech signals. Here are a few commonly used methods:
1. Source coding: Source coding techniques exploit the statistical properties of speech
signals to remove redundancy. This includes methods like predictive coding, transform coding,
and vector quantization. These techniques analyze the input speech signal, identify predictable
patterns, and represent them using fewer bits.
2. Variable bit rate coding: Instead of using a fixed data rate for compression, variable
bit rate coding allocates more bits to complex or critical parts of the speech signal while
using fewer bits for less important sections. This adaptive allocation allows for efficient
use of the available data rate, optimizing the overall quality.
3. Speech codecs: Speech codecs are specifically designed algorithms that combine
several compression techniques to achieve efficient speech compression. Common low
bit rate speech codecs include G.729, GSM, and Speex.
By applying these techniques, low bit rate speech compression algorithms can achieve
significant data reduction while maintaining reasonable speech quality.
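As one concrete example of the quantization step used in telephony speech coding, here is a sketch of mu-law companding, the nonlinearity behind the G.711 codec. The constant mu = 255 is the standard telephony value; the input sample 0.1 is an arbitrary example:

```python
import math

# Sketch: mu-law companding (G.711-style). Quiet samples get finer
# resolution than loud ones, so 8 companded bits can replace ~14 linear bits.

MU = 255

def mu_law_compress(x):
    """x in [-1, 1] -> companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign((math.exp(abs(y) * math.log1p(MU)) - 1) / MU, y)

x = 0.1
y = mu_law_compress(x)
print(round(y, 3))                 # the quiet input is pushed well up the scale
print(round(mu_law_expand(y), 6))  # round-trips back to ~0.1
```

A real codec would then quantize the companded value to 8 bits; the companding curve is what keeps quiet speech intelligible after that coarse quantization.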
In other words, the picture information is not transmitted alone. It is carried along with
various other signals, such as blanking pulses to make the retrace invisible and
synchronizing pulses to synchronize the scanning at the transmitter and the receiver.
All these components together are known as the Composite Video Signal (CVS).
Positive Polarity
In the case of positive polarity, the whiter the scene, the higher the amplitude of the
video signal. The blanking level is kept at zero; the sync pulse lies below zero, with the
sync tip at the most negative point, as shown in the figure below.
When the video signal is produced by photoconductive camera tubes, bright white light
gives a high-amplitude video signal. At the receiver, a stronger signal is needed to
reproduce white on the fluorescent screen, and zero signal to reproduce black.
Positive polarity
Negative Polarity
In the case of negative polarity, the brighter the scene, the smaller the amplitude. Here
the sync pulse is positive, that is, above the blanking level. Black is just below the
blanking level, and the brighter the scene, the lower its level below the blanking level;
white is near the bottom.
Negative polarity
The modulation of the RF carrier by the CVS is in the form of negative AM, where bright
picture points correspond to low carrier amplitude and the sync pulse to maximum
carrier amplitude. This type of modulation is called Negative Modulation.
In most of the TV systems, negative polarity is used to modulate the video carrier.
PAL is an abbreviation for Phase Alternate Line. This is the video format standard used in
many European countries. A PAL picture is made up of 625 interlaced lines and is displayed
at a rate of 25 frames per second.
SECAM is an abbreviation for Sequential Color with Memory (from the French Séquentiel
Couleur à Mémoire). This video format was developed in France and is used there, in the
countries of the former USSR, and in parts of Africa and Eastern Europe. Like PAL, a
SECAM picture is made up of 625 interlaced lines and is displayed at a rate of 25 frames
per second. However, because of the way SECAM processes the color information, it is not
compatible with the PAL video format standard.
On the other hand, JPEG (Joint Photographic Experts Group) and MPEG (Moving
Picture Experts Group) are both commonly used image and video compression formats,
but they serve different purposes.
JPEG (Joint Photographic Experts Group) and MPEG (Moving Picture Experts Group)
are both digital image and video compression standards developed by their respective
organizations. However, JPEG is primarily designed for compressing still images, while
MPEG is designed for compressing video.
The main difference between JPEG and MPEG lies in their compression techniques.
JPEG uses lossy compression, which means that it discards some of the image data in
order to reduce the file size. This can result in a loss of image quality, particularly when
the compression level is high. MPEG, on the other hand, combines lossy compression with
lossless entropy coding to achieve high compression ratios while maintaining acceptable
video quality.
Another key difference between JPEG and MPEG is the way they handle motion. JPEG
is designed for still images, so it does not take motion into account. MPEG, on the other
hand, is specifically designed for compressing video and can handle both inter-frame
and intra-frame compression to reduce file sizes.
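The inter-frame idea can be sketched with toy one-dimensional "frames": store one reference frame in full, plus per-pixel differences for the frames that follow, which are mostly zeros when little moves between frames:

```python
# Sketch: inter-frame compression as delta coding. Real video codecs use
# motion-compensated prediction; simple differencing is a stand-in here.

def encode_inter(frames):
    """Return the reference frame and per-frame pixel deltas."""
    ref = frames[0]
    deltas = [[b - a for a, b in zip(prev, cur)]
              for prev, cur in zip(frames, frames[1:])]
    return ref, deltas

def decode_inter(ref, deltas):
    """Rebuild all frames by accumulating deltas onto the reference."""
    frames = [ref]
    for d in deltas:
        frames.append([a + b for a, b in zip(frames[-1], d)])
    return frames

frames = [[10, 10, 10], [10, 11, 10], [10, 12, 11]]
ref, deltas = encode_inter(frames)
print(deltas)                               # [[0, 1, 0], [0, 1, 1]]
print(decode_inter(ref, deltas) == frames)  # True
```

A JPEG-style encoder would store each frame independently (intra only); the mostly-zero deltas show why exploiting temporal redundancy pays off for video.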
Overall, while JPEG and MPEG are both compression standards, they are optimized for
different types of media and employ different compression techniques.
Ans: Multimedia conferencing is a form of communication that allows multiple
participants to communicate using a combination of different media types, such as
audio, video, text, and images. It can be used for various purposes, including business
meetings, online learning, and remote collaboration.
Multimedia refers to the combination of different media types, such as text, audio, video,
and images, to deliver information. Multimedia is used in various applications, including
entertainment, education, advertising, and communication.
The main difference between multimedia and hypermedia is that multimedia refers to
the combination of different media types, while hypermedia refers to the use of
hyperlinks to navigate between those media types. Multimedia can exist without
hyperlinks, whereas hypermedia always includes hyperlinks as a fundamental feature.
3. Buffering Latency: In streaming applications, buffering is used to ensure a
smooth playback experience by preloading and storing a certain amount of
audio data before playback. The buffering introduces a delay to allow
sufficient data to be accumulated, reducing the chances of interruptions or
stutters during playback.
4. Hardware Latency: The hardware components involved in audio processing,
such as audio interfaces, sound cards, and audio drivers, can introduce
latency due to the time it takes for the data to pass through these
components.
Reducing audio latency is essential for maintaining a real-time and immersive audio
experience. It can be achieved through various techniques, including smaller buffer
sizes, optimized low-latency audio drivers, and faster processing hardware.
It's important to note that achieving extremely low latency can be challenging and may
require specialized hardware, software optimizations, and a well-designed system
architecture. The acceptable level of latency depends on the specific application and
user requirements.
A streaming protocol is a set of rules that defines how data is communicated from one
device or system to another across the Internet. Video streaming protocols standardize
the method of segmenting a video stream into smaller chunks that are more easily transmitted.
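The segmentation idea behind the HTTP-based protocols below can be sketched as follows. Real segmenters cut on keyframe boundaries and add manifests, so the fixed-size chunks here are a simplification:

```python
# Sketch: splitting a byte stream into fixed-size chunks, the basic move
# behind HTTP-based streaming (HLS, MPEG-DASH). Real segmenters cut on
# keyframe boundaries; fixed sizes are a simplification.

def segment(stream: bytes, chunk_size: int):
    """Return the stream as a list of chunks of at most chunk_size bytes."""
    return [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)]

data = bytes(range(10))
chunks = segment(data, 4)
print([len(c) for c in chunks])   # [4, 4, 2]
print(b"".join(chunks) == data)   # True: nothing is lost in segmentation
```

Because each chunk is an ordinary HTTP resource, players can fetch chunks at different bitrates as network conditions change, which is what adaptive bitrate (ABR) streaming exploits.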
1. HTTP Live Streaming (HLS)
Type
● HTTP-Based
Pros
Cons
● Latency: HLS cannot maintain as low a latency as some of the other preferred
protocols, so streams are typically delayed by several seconds.
● Poor Ingest: HLS isn't the best option for ingest, since HLS-compatible encoders are
not widely accessible or affordable.
2. Dynamic Adaptive Streaming over HTTP (MPEG-DASH)
Type
● HTTP-Based
Pros
● Adaptability: This protocol leverages ABR to stream at a high video quality over
different Internet speeds and conditions.
● Customization: MPEG-DASH is open-source, enabling users to tailor it to meet
their unique streaming needs.
Cons
● Compatibility: MPEG-DASH is not natively supported on Apple devices and browsers,
which favor HLS.
3. WebRTC
Type
● Modern Protocol
Pros
● Flexibility: Since WebRTC is open-source, it is flexible enough that developers can
customize it to suit their specific streaming requirements.
Cons
● Limited Support: The WebRTC video streaming protocol has only recently been
adopted as a web standard. The market has not had much time to adapt, so engineers
might encounter compatibility issues with this streaming setup.
4. Secure Reliable Transport (SRT)
Type
● Modern Protocol
Pros
● Security: This protocol features top-notch security and privacy tools that allow
broadcasters to rest assured their streaming content and viewers remain safe.
● Compatibility: SRT is device and operating system agnostic, making it highly
compatible and able to deliver streams to most Internet-enabled devices.
● Low Latency: The SRT streaming protocol features low-latency streaming thanks to
the support from error correction technology.
Cons
● Limited Support: Like WebRTC, SRT is still relatively new; the larger streaming
industry will need some time to evolve before this video protocol becomes
standardized.
5. Transmission Control Protocol (TCP)
Type
● Internet Protocol
Pros
● Reliability: If data packets arrive out of order or pieces are missing, the protocol
communicates with the sender to ensure each piece arrives where it should be.
Cons
● Slow Speed: The reordering and retransmission of the data packet cause TCP to
transmit slowly.
● Heavy Protocol: TCP requires three packets to set up a socket connection before
sending data.
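TCP's behavior described above can be seen in a minimal loopback sketch: the three-way handshake happens inside connect/accept, after which data arrives intact and in order. The message text and the ephemeral port are arbitrary examples:

```python
import socket
import threading

# Sketch: TCP connection setup and reliable delivery over loopback.
# The three-way handshake is performed inside connect()/accept().

def run_server(listener, results):
    conn, _ = listener.accept()          # handshake completes here
    with conn:
        results.append(conn.recv(1024))  # data arrives intact and in order

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

results = []
t = threading.Thread(target=run_server, args=(listener, results))
t.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello over TCP")

t.join()
listener.close()
print(results[0])   # b'hello over TCP'
```

The setup cost (socket, handshake, teardown) is exactly the "heavy protocol" overhead noted in the cons above, and is why low-latency streaming protocols often prefer UDP underneath.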
PAPER 2016
Que 1(a) What is multimedia? Describe various multimedia applications in detail.
MULTIMEDIA
Multimedia is the representation of information in an attractive and interactive manner using
a combination of text, audio, video, graphics and animation. Examples include e-mail, Yahoo
Messenger, video conferencing, and the Multimedia Message Service (MMS).
Multimedia as the name suggests is the combination of Multi and Media that is many types of
media (hardware/software) used for communication of information.
Components of Multimedia
Multimedia consists of the following 5 components:
Text
Characters are used to form the words, phrases, and paragraphs of text. Text appears
in some form in all multimedia creations. The text can be in a variety of fonts and sizes
to match the professional presentation of the multimedia software. Text in multimedia
systems can communicate specific information or serve as a supplement to the
information provided by the other media.
Graphics
Graphics are non-text information, such as a sketch, chart, or photograph, represented
digitally. They add to the appeal of a multimedia application. In many circumstances,
people dislike reading large amounts of text on computers, so pictures are used more
frequently than words to clarify concepts, offer background information, and so on.
Graphics are at the heart of any multimedia presentation; the use of visuals in
multimedia enhances the effectiveness and presentation of the concept.
Animations
Animation is a sequence of still images flipped through in rapid succession to give the
impression of movement: the process of making a still image appear to move. Animation
can also make a presentation lighter and more appealing, and it is quite popular in
multimedia applications. Commonly used animation viewing programs include Fax
Viewer, Internet Explorer, etc.
Video
Video consists of photographic images played back at speeds of 15 to 30 frames per
second so that they appear to be in full motion. The term video refers to a moving
image accompanied by sound, such as a television picture. Text can of course be
included in videos, either as captioning for spoken words or as text embedded in an
image, as in a slide presentation. Programs widely used to view videos include
RealPlayer, Windows Media Player, etc.
Audio
Audio is any sound, whether music, conversation, or something else. Sound is one of
the most important aspects of multimedia, delivering the joy of music, special effects,
and other forms of entertainment. Volume and sound pressure level are measured in
decibels. Audio files are used as part of the application context as well as to enhance
interaction. When audio appears within online applications and webpages, it must
occasionally be delivered using plug-in media players. MP3, WMA, WAV, MIDI, and
RealAudio are examples of audio formats.
Applications of Multimedia
Entertainment
The usage of multimedia in films creates a unique auditory and video impression.
Today, multimedia has completely transformed the art of filmmaking around the world.
Multimedia is the only way to achieve difficult effects and actions.
The entertainment sector makes extensive use of multimedia. It’s particularly useful for
creating special effects in films and video games. Interactive games become possible
thanks to the use of multimedia in the gaming business. Video games are more
interesting because of the integrated audio and visual effects.
Business
Marketing, advertising, product demos, presentation, training, networked
communication, etc. are applications of multimedia that are helpful in many businesses.
The audience can quickly understand an idea when multimedia presentations are used.
It gives a simple and effective technique to attract visitors’ attention and effectively
conveys information about numerous products. It’s also utilized to encourage clients to
buy things in business marketing.
Engineering
Multimedia is frequently used by software engineers in computer simulations for military
or industrial training. It’s also used for software interfaces created by creative experts
and software engineers in partnership.
Fine Arts
'Digital artist' is a new term for these types of artists. Digital painters create digital
paintings, matte paintings, and vector graphics of many varieties using computer
applications.
List some advantages of Multimedia.
(i) It is interactive and integrated: The digitization process integrates all of the
various media. The ability to receive immediate input enhances interactivity.
(ii) It is quite user-friendly: The user expends little effort, since they can sit and
watch the presentation, read the text, and listen to the audio.
(iii) It is flexible: Because it is digital, this media can be easily shared and adapted
to suit various settings and audiences.
(iv) It appeals to a variety of senses: Multimedia makes extensive use of the user's
senses, for example hearing, observing, and conversing.
(v) It is available for all types of audiences: It can be utilized for a wide range of
audiences, from a single individual to a group of people.
Ques 1(b) Any hardware that can send and receive data, instructions, and information is
referred to as a communications device. A modem is one kind of communication tool
that joins a channel to a sending or receiving device, like a computer. Data is processed
by computers as digital signals.
Types of Communication Device
1. Dial-up Modem: As previously said, digital signals from a computer must be
converted to analogue signals before being sent over regular telephone lines. This
conversion is carried out by a modem, often known as a dial-up modem, which is a
communications device. The word modem combines modulate, which turns a digital
signal into an analogue signal, and demodulate, which turns an analogue signal into a
digital signal.
A modem typically takes the shape of an adapter card that you place in an expansion
slot on the motherboard of a computer. A normal telephone cord has two ends: one
plugs into a port on the modem card and the other into a phone outlet.
2. ISDN and DSL Modems: You require a communications device to transmit and
receive the digital signals from the provider if you use ISDN or DSL to access the
Internet. A computer's digital data and information can be transmitted to and received
from an ISDN connection using an ISDN modem. A DSL modem transfers and receives
digital data and information over a DSL line, both from a computer and another device.
Typically, ISDN and DSL modems are external devices that connect to a port on the
system unit at one end and the telephone line at the other.
3. Cable Modems: A cable modem transmits and receives digital data over the cable
television network. Because more than 110 million households are wired for cable
television, cable modems offer home users a speedier option than dial-up, with speeds
comparable to DSL. Cable modems can currently transport data at substantially faster
speeds than dial-up modems and ISDN.
4. Wireless Modems: Some mobile users have a wireless modem that connects to the
Internet wirelessly from a laptop computer, a smartphone, or another portable device.
Wireless modems may be external or built in, and come in PC Card, ExpressCard
module, and flash memory card formats.
5. Network Cards: There are many different types of network cards. A desktop
computer's network card is an adapter card with a port where a cable can be connected.
Network cards for portable computers and devices include flash cards, PC Cards,
ExpressCard modules, and USB network adapters. There are also network cards that
enable wireless data transmission; this kind of card, also known as a wireless network
card, often has an antenna.
An Ethernet or token ring network card complies with the rules of a specific network
communications standard. The most popular kind of network card is the Ethernet card.
6. Wireless Access Points: A wireless access point is a hub for communications that
enables computers and other devices to communicate wirelessly with one another or
wirelessly with a wired network. For the best signal reception, wireless access points
include high-quality antennas.
Q2 a) Explain the multimedia distributed processing model
The multimedia distributed processing model refers to a system architecture designed to handle
the processing and distribution of multimedia content across a network of interconnected
devices. This model is particularly useful when dealing with large-scale multimedia applications
that require significant computational resources and real-time delivery.
1. Source Devices: These are the devices that capture or generate multimedia content, such
as cameras, microphones, sensors, or software applications. Source devices produce raw or
encoded multimedia data, which serves as the input for the distributed processing model.
2. Processing Nodes: Processing nodes are the computational units responsible for executing
specific processing tasks on the multimedia data. They can be distributed across multiple
devices, such as personal computers, servers, or specialized hardware accelerators.
Processing nodes may perform various operations, such as encoding, decoding, filtering,
analysis, or transformation of the multimedia content.
3. Communication Network: The network infrastructure that connects the source devices,
processing nodes, and destination devices and carries the multimedia data streams between
them.
4. Middleware: Middleware acts as a software layer that abstracts the underlying hardware and
provides a common communication interface for the distributed processing model. It enables
efficient coordination and collaboration among the processing nodes and facilitates the
management of multimedia data streams. Middleware also handles data synchronization,
resource allocation, and error handling within the distributed system.
5. Destination Devices: Destination devices receive the processed multimedia content and
deliver it to the end users or display it on output devices. These devices can include personal
computers, smartphones, tablets, television screens, projectors, or other display and playback
devices. The processed multimedia data may be transmitted over the network or stored locally
for later retrieval.
The model typically operates through the following workflow:
1. Data Capture: The source devices capture or generate multimedia content, such as
audio, video, or sensor data.
2. Data Encoding: The raw multimedia data is often encoded into a compressed format to
reduce its size and facilitate efficient transmission over the network. Encoding can be
performed at the source devices or at dedicated encoding nodes.
3. Data Distribution: The encoded multimedia data is transmitted over the communication
network to the processing nodes responsible for executing specific tasks.
4. Data Processing: The processing nodes execute their assigned operations, such as
decoding, filtering, analysis, or transformation, on the data they receive.
5. Data Fusion and Analysis: After processing, the results from different processing nodes
may need to be combined or analyzed to obtain meaningful insights or generate a final
multimedia output.
6. Data Delivery: The processed multimedia data is delivered to the destination devices
through the communication network. It can be streamed in real time or stored locally for later
retrieval.
7. Presentation: The destination devices present the processed multimedia content to the
end users through display or playback mechanisms, enabling them to perceive and interact with
the multimedia application.
The multimedia distributed processing model offers several advantages, including increased
scalability, enhanced computational power, improved fault tolerance, and efficient utilization of
resources. It enables the development of complex multimedia applications that require
high-performance processing and real-time delivery across distributed environments.
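As a rough sketch, the capture → encode → process → deliver flow described above can be expressed as a chain of stage functions. This is only an illustration with assumed names and toy data, not part of any real framework; in a real deployment each stage would run on a separate networked node rather than in one process.

```python
# Toy sketch of the distributed processing pipeline: each function stands
# in for a node that would normally run on its own machine.

def capture():
    # Source device: produce raw "samples" (here, plain numbers).
    return [0.0, 0.5, 1.0, 0.5, 0.0]

def encode(raw):
    # Encoding node: scale samples to 8-bit integers (a toy "codec").
    return [round(x * 255) for x in raw]

def process(encoded):
    # Processing node: a simple filter that clips bright values.
    return [min(v, 200) for v in encoded]

def deliver(processed):
    # Destination device: map values back to the 0.0-1.0 range for playback.
    return [v / 255 for v in processed]

stream = deliver(process(encode(capture())))
print(stream)
```

In a real system the hand-offs between stages would be network transfers coordinated by the middleware layer described above.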
The various components of multimedia are Text, Audio, Graphics, Video and Animation.
All these components work together to represent information in an effective and easy
manner.
AUDIO:
An audio format refers to the structure, encoding, and organization of audio data in a
digital file. It specifies how audio signals are stored, compressed, and represented,
determining crucial factors such as audio quality, file size, compatibility, and playback
capabilities. Audio formats enable the storage, transmission, and reproduction of sound
in various digital media applications.
Audio formats fall into three categories:
1. Uncompressed Format
2. Lossy Compressed Format
3. Lossless Compressed Format
1. Uncompressed Audio Format:
● PCM –
It stands for Pulse-Code Modulation. It represents raw analog audio signals in
digital form. To convert an analog signal into a digital one, the signal is
sampled at regular intervals, so PCM is characterized by a sampling rate and a
bit depth (the number of bits used to represent each sample). It is an exact
representation of the analog sound and does not involve compression. It is the
most common audio format used in CDs and DVDs.
● WAV –
It stands for Waveform Audio File Format; it was developed by Microsoft and
IBM in 1991. It is essentially a Windows container for audio formats, which
means a WAV file can contain compressed audio, although most WAV files contain
uncompressed audio in PCM format; the WAV file is just a wrapper around it. It
is compatible with both Windows and Mac.
● AIFF –
It stands for Audio Interchange File Format. It was developed by Apple for
Mac systems in 1988. Like WAV files, AIFF files can contain multiple kinds of
audio. It contains uncompressed audio in PCM format. It is just a wrapper for
the PCM encoding. It is compatible with both Windows and Mac.
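Because uncompressed PCM stores every sample directly, file size follows immediately from the sampling rate, bit depth, and channel count. A minimal sketch (the helper name is ours, not a standard API):

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    # Uncompressed PCM size = samples/second x bytes/sample x channels x time.
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

# CD-quality audio: 44,100 samples/s, 16 bits per sample, stereo.
one_minute = pcm_size_bytes(44_100, 16, 2, 60)
print(one_minute)  # 10584000 bytes, roughly 10 MB per minute
```

This is why the lossy formats below matter: a minute of CD audio takes about 10 MB uncompressed but only about 1 MB at a typical MP3 bitrate.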
2. Lossy Compressed Format:
It is a form of compression that loses data during the compression process, but the
difference in quality is usually not noticeable to the listener.
● MP3 –
It stands for MPEG-1 Audio Layer 3. It was released in 1993 and quickly became
popular; it is the most widely used audio format for music files. The main aim
of MP3 is to remove the sounds that are inaudible or barely noticeable to human
ears, thereby making the music file smaller. MP3 is close to a universal format
and is compatible with almost every device.
● AAC –
It stands for Advanced Audio Coding. It was developed in 1997, after MP3. The
compression algorithm used by AAC is much more complex and advanced
than MP3, so when comparing a particular audio file in MP3 and AAC formats
at the same bitrate, the AAC one will generally have better sound quality. It is
the standard audio compression method used by YouTube, Android, iOS,
iTunes, and PlayStations.
● WMA –
It stands for Windows Media Audio. It was released in 1999 and was designed
to remove some of the flaws of the MP3 compression method. In terms of
quality it is better than MP3, but it is not widely used.
3. Lossless compression:
This method reduces file size without any loss in quality. However, it is not as
space-efficient as lossy compression: a losslessly compressed file is typically two to
three times larger than a lossy-compressed one.
● FLAC –
It stands for Free Lossless Audio Codec. It can compress a source file by up
to 50% without losing data. It is most popular in its category and is
open-source.
● ALAC –
It stands for Apple Lossless Audio Codec. It was developed by Apple, launched
in 2004, and made open-source and royalty-free in 2011.
● WMA Lossless –
It stands for Windows Media Audio Lossless. It is the least efficient of the
three in terms of compression, is not open-source, and has limited hardware
support.
VIDEO:
Any video file has two components: a container and a codec. The video format is the
container that holds the audio, video, subtitles, and other metadata. A codec is a
program that encodes and decodes multimedia data such as audio and video.
A video codec encodes and compresses the video, while an audio codec does the
same with sound while making a video. The encoded video and audio are then synced
and saved in a file format media container.
Video Codec
A video codec (the name codec originates from “enCOde / DECode”) is a protocol for
encoding and decoding video. H.264, MPEG-4, and DivX are examples of common
codecs. A well-designed codec has high efficiency or the capacity to maintain quality
while shrinking the file size.
Video Container
The container format specifies how the metadata and data in a file are organized, not
how the video is encoded (which the codec determines). The metadata and
compressed video data encoded with the codec are stored in the container file. The
file’s extension reflects the container format, also referred to as “the format.” .AVI, .MP4,
and .MOV are common container types. Container formats can be combined with a
variety of codecs to determine which devices and programs the file is compatible with.
Common container formats for video files include MP4, MOV, AVI, MKV, and WMV.
Que 3 a) What is multimedia communication? Explain any two multimedia
networks.
Multimedia communication involves presenting information in multiple media formats.
Images, video, audio, and text are all part of multimedia communication, although a
single instance of multimedia communication does not have to include all four components.
Applications:
● person-to-person communications (e.g. email)
● person-to-system communications (e.g. web-browsing)
Different types of networks are used to provide multimedia communication services; for an
overview of multimedia networks, see:
http://www.eie.polyu.edu.hk/~enyhchan/mt_intro.pdf
Q4 a) Explain the concept of MPEG in detail.
The Moving Picture Experts Group (MPEG) is an alliance of working groups established
jointly by ISO and IEC that sets standards for media coding, including compression
coding of audio, video, graphics, and genomic data; and transmission and file formats
for various applications.
MPEG formats are used in various multimedia systems. The most well known older
MPEG media formats typically use MPEG-1, MPEG-2, and MPEG-4 AVC media coding
and MPEG-2 systems transport streams and program streams. Newer systems typically
use the MPEG base media file format and dynamic streaming (a.k.a. MPEG-DASH).
MPEG-1 (1993): Coding of moving pictures and associated audio for digital storage
media at up to about 1.5 Mbit/s
MPEG-2 (1996): Generic coding of moving pictures and associated audio information
(ISO/IEC 13818). Transport, video and audio standards for broadcast-quality television.
MPEG-2 standard was considerably broader in scope and of wider appeal – supporting
interlacing and high definition. MPEG-2 is considered important because it was chosen
as the compression scheme for over-the-air digital television ATSC, DVB and ISDB,
digital satellite TV services like Dish Network, digital cable television signals, SVCD and
DVD Video.[23] It is also used on Blu-ray Discs, but these normally use MPEG-4 Part
10 or SMPTE VC-1 for high-definition content.
b) Sampling
Since an analogue image is continuous not just in its co-ordinates (x axis) but also in its
amplitude (y axis), the part of digitization that deals with the co-ordinates is known as
sampling. In digitizing, sampling is done on the independent variable; in the case of the
equation y = sin(x), it is done on the x variable.
A continuous signal typically contains random variations caused by noise. In sampling we
reduce this noise by taking samples: the more samples we take, the better the quality of
the image and the more the noise is removed, and vice versa. However, sampling on the
x axis alone does not convert the signal to digital form; the y axis must also be sampled,
a step known as quantization.
Sampling has a relationship with image pixels. The total number of pixels in an image is
calculated as pixels = total number of rows x total number of columns. For example, an
image with a total of 36 pixels is a square image of 6 x 6. As noted above, more samples
eventually result in more pixels, so taking 36 samples of the continuous signal on the
x axis corresponds to the 36 pixels of this image. The number of samples is also directly
equal to the number of sensors on the CCD array.
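The 36-sample example above can be sketched in code: sampling reads a continuous signal such as y = sin(x) at equally spaced points on the x axis. The sample helper below is illustrative only.

```python
import math

def sample(signal, start, stop, n_samples):
    # Take n_samples equally spaced readings of a continuous signal.
    step = (stop - start) / (n_samples - 1)
    return [signal(start + i * step) for i in range(n_samples)]

# 36 samples of y = sin(x), mirroring the 6 x 6 = 36 pixel example above.
samples = sample(math.sin, 0.0, 2 * math.pi, 36)
print(len(samples))  # 36 - one sample per pixel (and per CCD sensor)
```

Note that the samples are still real-valued; turning them into a finite set of levels is the quantization step discussed next.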
Quantization
Quantization is the counterpart of sampling: it is done on the y axis, while sampling is
done on the x axis. Quantization is the process of transforming a real-valued sampled
image into one taking only a finite number of distinct values. Under the quantization
process the amplitude values of the image are digitized. In simple words, when you
quantize an image, you are dividing a signal into quanta (partitions).
Now let's see how quantization is done. Here we assign levels to the values generated by
the sampling process. Although the samples have been taken, they still span a continuous
range of gray-level values vertically. Under quantization these vertically ranging values
are divided into a fixed set of levels or partitions, for example 5 levels ranging from
0 (black) to 4 (white). The number of levels can vary according to the type of image you
want.
There is a relationship between quantization and gray-level resolution. An image quantized
to 5 levels of gray would contain only 5 distinct tones: more or less a black-and-white
image with a few shades of gray.
When we want to improve the quality of the image, we can increase the number of levels
assigned to the sampled image; increasing it to 256 gives a standard grayscale image.
Whatever level we assign is called a gray level. Most digital image-processing devices
quantize into k equal intervals: if b bits per pixel are used, the number of levels is
k = 2^b.
The number of quantization levels should be high enough for human perception of fine
shading details in the image. The occurrence of false contours is the main problem in
an image that has been quantized with insufficient brightness levels.
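The level assignment described above can be sketched as follows; quantize is an illustrative helper that maps a real-valued sample in [0, 1] to one of k discrete gray levels.

```python
def quantize(value, levels):
    # Map a value in [0.0, 1.0] to the nearest of `levels` gray levels.
    return min(levels - 1, round(value * (levels - 1)))

# 5 levels, as in the example above: 0 = black ... 4 = white.
print([quantize(v, 5) for v in [0.0, 0.2, 0.5, 0.8, 1.0]])  # [0, 1, 2, 3, 4]

# With b bits per pixel there are k = 2**b levels; b = 8 gives 256 grays.
print(2 ** 8)  # 256
```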
Q6 a) Explain the NTSC and PAL video standards
Ans: NTSC (National Television System Committee) and PAL (Phase Alternating Line) are two
different video standards used for analog television broadcasting and playback. Here's an
explanation of each standard:
1. NTSC:
NTSC is the video standard primarily used in North America, parts of South America, Japan,
and some other countries. It was developed by the National Television System Committee.
NTSC video has the following characteristics:
- Frame rate: NTSC operates at a frame rate of 29.97 frames per second (approximately 30
frames per second).
- Resolution: NTSC uses 525 scan lines per frame; in digital form this corresponds to a
standard resolution of 720 x 480 pixels.
- Color encoding: NTSC uses a composite video signal in which the color (chrominance)
information is modulated onto a subcarrier and combined with the luminance (brightness)
signal.
- Aspect ratio: The aspect ratio for NTSC is 4:3, which means the width of the screen is four
units for every three units of height.
2. PAL:
PAL is the video standard used in most of Europe, Australia, Asia, Africa, and some parts of
South America. It stands for Phase Alternating Line. PAL video has the following characteristics:
- Frame rate: PAL operates at a frame rate of 25 frames per second.
- Resolution: PAL uses 625 scan lines per frame; in digital form this corresponds to a
standard resolution of 720 x 576 pixels.
- Color encoding: PAL uses a different color encoding system called phase alternation, which
helps in reducing color artifacts and provides better color reproduction compared to NTSC.
- Aspect ratio: PAL also has an aspect ratio of 4:3, similar to NTSC.
One notable difference between NTSC and PAL is the frame rate. NTSC has a higher frame
rate than PAL, but PAL offers better color reproduction and is less prone to video artifacts. As a
result, NTSC and PAL videos are not directly compatible, and TVs and video players need to
support both standards to play content from different regions.
With the advent of digital television and the transition to high-definition formats, the NTSC and
PAL standards have become less relevant. Most modern systems now use digital video
standards such as ATSC (Advanced Television Systems Committee) in North America and DVB
(Digital Video Broadcasting) in Europe and other regions. These digital standards offer higher
resolutions, improved image quality, and additional features compared to their analog
predecessors.
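The figures above can be cross-checked with a quick sketch (the helper function is illustrative, not part of any standard):

```python
def frame_stats(width, height, fps):
    # Pixels per frame and pixels pushed per second for a video standard.
    pixels = width * height
    return pixels, pixels * fps

ntsc = frame_stats(720, 480, 29.97)  # (345600, ~10.36 million pixels/s)
pal = frame_stats(720, 576, 25)      # (414720, 10368000 pixels/s)
print(ntsc, pal)
```

PAL trades temporal resolution (fewer frames per second) for spatial resolution (more lines per frame); the per-second pixel rates of the two standards come out almost identical.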
Q5) (a) What is JPEG? Enumerate its objectives, applications and architecture
Image compression
Image compression is the method of data compression on digital images.
JPEG compression
JPEG stands for Joint Photographic Experts Group. It is the first international standard in
image compression and is widely used today. JPEG can be lossy as well as lossless, but the
technique discussed here is the lossy compression technique.
JPEG (pronounced JAY-peg) is a graphic image file compressed with lossy compression using the
standard developed by the ISO/IEC Joint Photographic Experts Group. JPEG is one of the most
popular image formats on the internet and is supported by most web browsers and image editing
software.
JPEG images are compressed using a technique called the discrete cosine transform (DCT). The
DCT is a mathematical transform that breaks an image (processed in 8 x 8 pixel blocks) down
into a weighted sum of cosine functions. The resulting coefficients are then compressed using
techniques such as quantization and Huffman coding.
The amount of compression that can be applied to a JPEG image depends on the quality setting. The
higher the quality setting, the less compression is applied, and the larger the file size. The lower the
quality setting, the more compression is applied, and the smaller the file size.
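The DCT-and-quantize pipeline can be illustrated in one dimension on a single 8-sample row of pixel values. This is a simplified sketch: real JPEG applies a 2-D DCT to 8 x 8 blocks and uses a full quantization table, whereas here a single assumed step size stands in for the quality setting (a larger step means lower quality and a smaller file).

```python
import math

N = 8  # JPEG processes image data in 8-sample blocks (8 x 8 in two dimensions)

def dct(x):
    # Orthonormal 8-point DCT-II: express the samples as cosine amplitudes.
    out = []
    for k in range(N):
        c = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(c * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

def idct(X):
    # Inverse transform (DCT-III): rebuild the samples from the amplitudes.
    return [sum((math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) * X[k]
                * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

row = [52, 55, 61, 66, 70, 61, 64, 73]          # one row of pixel values
step = 16                                        # quantization step: the lossy part
quantized = [round(c / step) for c in dct(row)]  # small coefficients collapse to 0
restored = [round(v) for v in idct([q * step for q in quantized])]
print(quantized)  # mostly zeros - this sparsity is what compresses well
print(restored)   # close to, but not exactly, the original row
```

The long runs of zero coefficients are what entropy coding (Huffman coding in JPEG) then packs into very few bits; the rounding in the quantize step is exactly where the "lossy" quality loss happens.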
Advantages of JPEG:
● Small file size. JPEG images are typically much smaller than uncompressed images,
which makes them ideal for storing and transmitting images over the internet.
● Wide support. JPEG is supported by most web browsers and image editing software.
● Good quality. JPEG images can be of good quality, even with a high degree of
compression.
Table comparing the differences between lossy (for example, JPEG) and lossless compression.
Disadvantages of JPEG:
● Lossy compression. Some data is lost during the compression process, which can
lead to a loss of quality in the image.
● Not suitable for all images. For example, JPEG is not a good choice for images with
sharp edges or fine details.
Q5) b) Explain the process of Speech Recognition and Generation
Speech Recognition is the ability to translate a dictation or spoken word to text.
It is achieved by following certain steps and the software responsible for it is known as a ‘Speech
Recognition System’.
SR systems are usually implemented in the form of dictation software and intelligent assistants in personal
computers, smartphones, web browsers and many other devices.
Key terms in speech recognition include:
Pronunciation: What the speech engine thinks a word should sound like.
Grammar: It defines the domain, or context, within which the recognition engine works.
Accuracy: The ability of a recognizer can be examined by measuring its accuracy, or how well
it recognizes utterances.
Training: Some speech recognizers have the ability to adapt to a speaker. When the system has
this ability, it may allow training to take place.
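The "Accuracy" measure above can be computed as the fraction of utterances the recognizer transcribes correctly. The data below is hypothetical, purely for illustration:

```python
def accuracy(recognized, reference):
    # Fraction of utterances recognized exactly right.
    correct = sum(1 for r, t in zip(recognized, reference) if r == t)
    return correct / len(reference)

# Hypothetical recognizer output vs. what was actually said.
reference  = ["open file", "save file", "close window", "print page"]
recognized = ["open file", "safe file", "close window", "print page"]
print(accuracy(recognized, reference))  # 0.75 - three of four utterances correct
```

Real systems usually report finer-grained measures such as word error rate, but the idea is the same: compare the recognizer's output against a known reference.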
Speech generation(synthesis)
Speech synthesis is the artificial production of human speech. A computer system used for
this purpose is called a speech synthesizer, and can be implemented in software or hardware
products. A text-to-speech (TTS) system converts normal language text into speech; other
systems render symbolic linguistic representations like phonetic transcriptions into
speech.[1] The reverse process is speech recognition.
Synthesized speech can be created by concatenating pieces of recorded speech that are
stored in a database. Systems differ in the size of the stored speech units; a system that
stores phones or diphones provides the largest output range, but may lack clarity. For
specific usage domains, the storage of entire words or sentences allows for high-quality
output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other
human voice characteristics to create a completely "synthetic" voice output.[2]
The quality of a speech synthesizer is judged by its similarity to the human voice and by its
ability to be understood clearly. An intelligible text-to-speech program allows people with
visual impairments or reading disabilities to listen to written words on a home computer.
Many computer operating systems have included speech synthesizers since the early 1990s.
JBIG (Joint Bi-level Image Experts Group) compression:
The JBIG standard compresses bilevel (two-tone) images by predicting each pixel from a
context of neighboring, previously coded pixels. A small template of surrounding pixels
forms the context, and an adaptive probability model conditioned on that context is used to
code the current pixel; pixels that are well predicted by their context cost very few bits
to encode.
The JBIG algorithm uses a technique called arithmetic coding to compress the data.
Arithmetic coding encodes a whole sequence of symbols as a single fractional number,
spending fewer bits on the symbols that occur most frequently, which results in strong
compression.
The JBIG standard offers several benefits over other compression techniques for bilevel
images. Firstly, it can achieve high compression ratios while maintaining a high level of image
quality, making it ideal for applications such as document scanning and archiving. Additionally,
JBIG compression is efficient for compressing multiple pages of documents with similar content,
such as faxes or reports.
Overall, the JBIG standard is a powerful and effective compression technique for bilevel images,
offering high compression ratios and excellent image quality. It has been widely adopted in
applications such as fax transmission, document imaging, and digital archiving.
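JBIG's context-modelled arithmetic coder is too involved to reproduce here, but the redundancy it exploits, long uniform runs of black or white pixels, can be illustrated with a much simpler run-length encoding sketch (an illustration only, not the actual JBIG algorithm):

```python
def rle_encode(bits):
    # Collapse a bilevel scanline into (pixel_value, run_length) pairs.
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    # Expand the (value, length) pairs back into the original scanline.
    return [v for v, n in runs for _ in range(n)]

# A scanline from a scanned document: long runs of white (0) around ink (1).
line = [0] * 20 + [1] * 4 + [0] * 40
runs = rle_encode(line)
print(runs)  # [(0, 20), (1, 4), (0, 40)] - 3 pairs instead of 64 pixels
assert rle_decode(runs) == line  # lossless round trip
```

Scanned documents are mostly white space, so even this naive scheme shrinks them dramatically; JBIG's context modelling pushes the same idea much further.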
Write short notes on:
a. Need of compression
b. Multimedia conferencing
c. Media stream protocol.
Ans: A. In the realm of multimedia technology, the need for compression has emerged
as an indispensable aspect. Compression refers to the process of reducing the size of
data files while maintaining their perceptual quality. It plays a vital role in various
multimedia applications, such as audio, video, images, and more. Some of the reasons
why compression is crucial in multimedia technology:
Efficient Storage: Multimedia files, such as videos, audio tracks, and high-resolution
images, often occupy a significant amount of storage space. Compression techniques
enable the reduction of file sizes without sacrificing their quality. By compressing
multimedia files, it becomes feasible to store and manage a vast amount of data
efficiently. This benefit is particularly crucial in scenarios where storage resources are
limited or costly.
Device and Platform Compatibility: Multimedia content is consumed across a wide array
of devices, platforms, and media players. Compression facilitates compatibility by
reducing the file size, making it easier to transfer and play multimedia files on various
devices with varying storage capacities and computational capabilities. It ensures that
multimedia content can be accessed and enjoyed across different platforms without
encountering compatibility issues.
Cost Savings: Storage and bandwidth requirements directly influence the cost
associated with multimedia applications. By compressing multimedia files, organizations
can significantly reduce storage expenses and optimize bandwidth usage, leading to
substantial cost savings. This benefit is particularly important for businesses that deal
with large volumes of multimedia data, such as media production companies or
streaming platforms.
B. Multimedia conferencing enables participants at different locations to meet using audio,
video, and shared data over a network. Its benefits include:
Cost and Time Savings: By eliminating the need for travel and accommodation,
multimedia conferencing significantly reduces travel expenses and saves time.
Participants can join meetings from their own locations, avoiding the logistical
challenges associated with physical meetings. This benefit is particularly valuable for
global businesses, remote teams, and individuals seeking efficient communication
solutions.
Flexibility and Accessibility: Multimedia conferencing offers flexibility and accessibility,
allowing participants to connect from anywhere with an internet connection. This
accessibility breaks down geographical barriers, enabling individuals from different
locations and time zones to participate in meetings and discussions conveniently.
C. Media stream protocols govern how multimedia data is packaged and delivered over a
network. One important aspect of streaming protocols is that both the output device and
the viewer have to support the protocol in order for it to work.
For example, if you’re sending a stream in MPEG-DASH, but the video player on the
device to which you’re streaming doesn’t support MPEG-DASH, your stream won’t
work.
For this reason, standardization is important. There are currently a few major media
streaming protocols in widespread use, which we’ll look at in detail in a moment. Six
common protocols include: