Multimedia - ECC
Images are a media type that presents visual information. They can be drawings,
paintings, or photographs, and they are used to create interest and provide information.
Photographs and other types of graphical data are designed specifically for display. An
image on a screen is made up of dots called pixels. A pixel is the smallest part of the
screen that can be controlled by the computer or other device. The total number of
pixels on a screen is called its resolution (e.g., the new iPad's Retina display has a
2048 x 1536 resolution). An image can be represented in two ways: as a bitmap or as a
vector. Typical bitmap file formats include JPEG, GIF, PNG and BMP; typical vector
formats include SVG, WMF and EMF.
An image consists of a rectangular array of dots called pixels. The size of the image is
specified as width x height, in pixels. The physical size of the image, in inches or
centimeters, depends on the resolution of the device on which the image is displayed;
resolution is usually measured in DPI (dots per inch). The same image will therefore
appear smaller on a device with a higher resolution than on one with a lower resolution.
For color images, one needs enough bits per pixel to represent all the colors in the
image. The number of bits per pixel is called the depth of the image.
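These relationships can be sketched in a few lines. The 2048 x 1536 resolution is the Retina figure mentioned above; the 24-bit depth and the 264 DPI value are illustrative assumptions, not facts from the text:

```python
# Sketch: relating pixel dimensions, bit depth, and physical size.
# 24-bit depth and 264 DPI below are assumed example values.

def raw_image_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed bitmap size in bytes."""
    return width_px * height_px * bits_per_pixel // 8

def physical_size_inches(width_px, height_px, dpi):
    """Physical width/height when shown on a device with the given DPI."""
    return width_px / dpi, height_px / dpi

# 2048 x 1536 at 24 bits per pixel (8 bits each for R, G, B):
size = raw_image_bytes(2048, 1536, 24)           # 9437184 bytes (~9 MB)
w_in, h_in = physical_size_inches(2048, 1536, 264)
print(size, round(w_in, 2), round(h_in, 2))
```

Note how the uncompressed size (about 9 MB here) motivates the compressed formats (JPEG, PNG, etc.) listed above.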
Que 1(b) What is Sound?
Sound is a form of energy caused by the vibration of objects. Vibration is the rapid
to-and-fro motion of an object or particle. We sense these vibrations in our ears as
sound.
Characteristics of Sound:
Sound has specific characteristics that define the way we hear it. Some of the major
characteristics of sound are as follows:
Wavelength
It is the distance between two adjacent areas of compression or rarefaction. In
transverse waves, it is the distance between two adjacent peaks or two adjacent troughs.
Pitch
Pitch is the characteristic of sound by which a shrill note can be distinguished from a
grave or flat one; it is why we can tell a female voice from a male voice without seeing
the speaker. The term 'pitch' is often used in music. Pitch depends on the frequency of
the sound wave: a note of high frequency has a high pitch and a note of low frequency
has a low pitch. For example, a small child's voice has a higher frequency, so its pitch
is higher than the pitch of a grown man. A sound with a high frequency is called shrill.
Loudness
Loudness is the sensation of how strong a sound wave is at a place. It is always a
relative term and a dimensionless quantity, measured in decibels (dB). It is given as:
L = 10 log10(I / I0), where 'I' is the intensity of the sound and 'I0' is a reference
intensity (conventionally the threshold of hearing, 10^-12 W/m^2).
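The decibel relation can be checked numerically. The sketch below uses the conventional reference intensity I0 = 10^-12 W/m² (the threshold of hearing), which is an assumption not stated explicitly in the text:

```python
import math

def loudness_db(intensity, i_ref=1e-12):
    """Sound level in decibels relative to a reference intensity
    (default: 1e-12 W/m^2, the conventional threshold of hearing)."""
    return 10 * math.log10(intensity / i_ref)

print(loudness_db(1e-12))  # ~0 dB  (threshold of hearing)
print(loudness_db(1e-2))   # ~100 dB (roughly a jackhammer)
```

Because the scale is logarithmic, every tenfold increase in intensity adds 10 dB.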
The loudness depends on the amplitude of the vibration: the higher the amplitude, the
louder the sound. When we pluck a string of a sitar gently, it starts vibrating with low
amplitude; if we apply more energy by plucking it more strongly, the string vibrates with
greater amplitude and produces a louder sound. As the amplitude of vibration increases,
the loudness also increases.
Loudness is a measure of the intensity or strength of the sound waves; in waves, it
corresponds to amplitude. Amplitude is the amount of displacement of the particles as
they vibrate in the medium: the greater the displacement, the greater the amplitude, and
the louder the sound.
Pluck the string of a guitar lightly. It vibrates with less amplitude giving out a low sound.
Pluck it with more force. The sound is louder due to its higher amplitude.
Quality
Quality is also described by the word timbre. Since different sources produce
different sounds, timbre helps us to distinguish between them. A sound of good
quality is pleasant to listen to. Instruments of different shapes and sizes produce
harmonics of different loudness, so their sounds can easily be distinguished.
Timbre
In a musical concert, we can clearly hear the different sounds coming from the various
musical instruments. For example, we can distinguish between the piano, drum, sitar,
clarinet or flute. How is it possible?
It is possible because of a property of sound called timbre. In simple words, timbre
means the quality of the sound. Just as each part of a picture has its own color that our
eyes can distinguish, each sound has its own timbre; timbre is therefore equivalent to
color and is also called tone color.
Two sounds from different sources can have the same pitch (frequency) and loudness
(amplitude). But we can distinguish between them because of the difference in their
timbres. This is because each sound has its own waveform or the shape of its wave. A
waveform is formed by mixing up waves of different frequencies. So every object will
create a sound with its own waveform. For example, the waveform of a flute is different
from that of a veena.
Music has an organized structure in its waveform. This makes it pleasant to listen to.
We can define noise as a random set of waves that causes unpleasantness when we
hear it. Multiple waves of different frequencies and amplitudes get mixed up, giving a
jarring effect when we listen to it. Sounds of machines, traffic, and crowded places are
examples of sources of noise. When this disorganized sound becomes too loud, it is
noise pollution and may lead to health disorders.
Types of SCSI
1. SCSI-1: SCSI-1 features an 8-bit data bus, allowing for the parallel transfer of 8
bits of data simultaneously. It supports a data transfer rate of up to 5 MB/s.
2. SCSI-2: SCSI-2 maintains an 8-bit data bus similar to SCSI-1 and offers backward
compatibility. It supports data transfer rates of up to 10 MB/s.
3. Fast SCSI: Fast SCSI, also known as SCSI-2 Fast, operates with an 8-bit data bus
and provides increased data transfer rates ranging from 10 MB/s to 20 MB/s.
4. Wide SCSI: Wide SCSI expands the data bus width to 16 bits, effectively doubling
the data transfer rate compared to Fast SCSI. It supports data rates of up to 20
MB/s or 40 MB/s.
5. Ultra SCSI: Ultra SCSI retains the 8-bit data bus but doubles the bus clock relative
to Fast SCSI, offering data transfer rates of up to 20 MB/s. (Low-voltage differential
signaling arrived later, with Ultra2 SCSI.)
6. Ultra Wide SCSI: Ultra Wide SCSI combines the wider 16-bit data bus of Wide
SCSI with the higher data transfer rates of Ultra SCSI, resulting in speeds of up to
40 MB/s.
7. Ultra2 SCSI: Ultra2 SCSI supports an 8-bit data bus and employs LVD/SE (Low
Voltage Differential/Single Ended) signaling. It achieves data transfer rates of 40
MB/s to 80 MB/s.
8. Ultra3 SCSI: Ultra3 SCSI, also known as Ultra160 SCSI, uses a 16-bit data bus with
double-transition clocking and offers data transfer rates of up to 160 MB/s.
9. Ultra320 SCSI: Ultra320 SCSI keeps the 16-bit data bus and delivers data transfer
rates of up to 320 MB/s.
10. Serial Attached SCSI (SAS): SAS is a newer SCSI standard that uses a serial data
transfer method. It offers higher data rates and improved scalability compared to
parallel SCSI. The data bus width for SAS can vary, with 1, 2, 4, or 8 lanes
available.
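As a rough sketch, the peak rate of a parallel SCSI bus is just bus width times clock rate (times transfers per cycle for double-transition variants). The clock figures below are the commonly quoted ones and should be treated as assumptions rather than facts from the list above:

```python
# Sketch: peak throughput of a parallel bus in MB/s.
# peak = (bus width in bytes) x (clock in MHz) x (transfers per cycle)

def peak_mb_per_s(bus_bits, clock_mhz, transfers_per_cycle=1):
    return (bus_bits // 8) * clock_mhz * transfers_per_cycle

print(peak_mb_per_s(8, 5))          # SCSI-1:          5 MB/s
print(peak_mb_per_s(8, 10))         # Fast SCSI:      10 MB/s
print(peak_mb_per_s(16, 10))        # Wide SCSI:      20 MB/s
print(peak_mb_per_s(16, 40))        # Ultra2 Wide:    80 MB/s
print(peak_mb_per_s(16, 40, 2))     # Ultra160 (DDR): 160 MB/s
```

This makes it easy to see why "Wide" variants double the rate of their narrow counterparts: only the bus width changes.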
Features of SCSI:
● High-speed data transfer: SCSI interfaces offer faster data transfer rates
compared to older interfaces like parallel ports.
● Wide device support: SCSI supports a wide range of peripheral devices, including
hard drives, optical drives, tape drives, scanners, and more.
● Daisy-chaining: SCSI allows multiple devices to be connected in a chain or bus,
simplifying the connectivity of multiple peripherals.
● Command set: SCSI has a versatile command set that enables advanced
functionality and control over connected devices.
● Flexibility: SCSI interfaces can support a variety of data transfer modes and
configurations, allowing for flexible device connectivity.
Drawbacks of SCSI:
● Cost: SCSI devices and cables can be more expensive compared to other
interface options.
● Cable length and flexibility: SCSI cables have limitations in terms of length and
flexibility, which can be challenging in certain setups.
● Device limitations: SCSI has limitations on the maximum number of devices that
can be connected in a chain.
● Complexity: SCSI can be more complex to set up and configure compared to
other interface standards.
● Compatibility: Compatibility issues may arise when connecting different SCSI
devices or when connecting SCSI devices with newer computer systems.
● Reduced popularity and support: The demand for SCSI has declined with the
emergence of newer interface technologies, leading to reduced availability and
support.
MCI
In the context of multimedia, MCI stands for Media Control Interface. MCI is a
programming interface that provides a standardized way to control and manage
multimedia devices and their associated resources, such as sound cards, CD-ROM
drives, video capture cards, and MIDI devices. It was developed by Microsoft as part of
the Windows operating system.
MCI provides a set of commands and functions that allow applications to perform
various multimedia operations, including playback, recording, seeking, volume control,
and device configuration. By using MCI, multimedia applications can interact with
different multimedia devices in a consistent and platform-independent manner.
MCI Devices
The Media Control Interface defines the following standard device types:
● cdaudio
● Digital video
● overlay
● sequencer
● VCR
● Video disc
● waveaudio
Each of these so-called MCI devices (e.g. a CD-ROM or VCD player) can play a certain type of file;
for example, AVIVideo plays .avi files and CDAudio plays CD-DA tracks, among others. Other MCI
devices have also been made available over time.
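MCI is driven by command strings. The sketch below only builds such strings; on Windows they would be passed to the mciSendString function in winmm.dll. The file name and alias are made-up examples:

```python
# Sketch: constructing MCI command strings. On Windows these strings
# would be handed to mciSendString in winmm.dll; here we only build them.

def mci_open(filename, device_type, alias):
    """e.g. open "song.wav" type waveaudio alias snd"""
    return f'open "{filename}" type {device_type} alias {alias}'

def mci_play(alias, wait=False):
    """Play a previously opened device, optionally blocking until done."""
    return f"play {alias} wait" if wait else f"play {alias}"

def mci_close(alias):
    """Release the device."""
    return f"close {alias}"

print(mci_open("song.wav", "waveaudio", "snd"))
print(mci_play("snd", wait=True))
print(mci_close("snd"))
```

The same open/play/close pattern applies to the other device types in the list (cdaudio, sequencer, and so on), which is what makes MCI a uniform interface.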
The physiology of speech:-
(i) The speech signal is generated when air expelled from the lungs acoustically excites
the vocal cords, producing a sequence of sounds.
(ii) The lungs, along with the diaphragm, are the main source of power for producing
speech signals.
(iii) There are 3 main cavities of vocal tract:
(a) Pharynx (b) Oral cavity (c) Nasal cavity
(iv) While speaking, the air gushes through the V-shaped opening called the glottis and
through the larynx into the vocal tract.
(v) The larynx's basic function is to manipulate the airflow by swiftly opening and
closing the folds of the vocal cords, producing a variety of sounds.
(vi) The vibration frequency is dependent on mass and tension. It varies from person to
person.
(vii) The soft palate connects the nasal cavity to the pharynx and can isolate them from
each other. The lower end of the pharynx contains the epiglottis and the false vocal
cords, which prevent food from entering the larynx: they close during swallowing and
open during respiration.
(viii) The acoustics vary as we move our lips, tongue, palate, teeth and cheeks, and
also depend on their size and shape. The walls of and constrictions in the vocal tract
also shape the generated sound.
(ix) The organs responsible for speech are the lungs, larynx and vocal tract: the lungs
provide the larynx with airflow, and the larynx then modifies it to produce the source
signal for the vocal tract.
(x) Periodic, noisy and impulsive sources are the basic categories of sound, and they
are generally used in combination. For example, the word 'shop' uses all three sources:
|sh| is noisy, |o| periodic and |p| impulsive.
When focusing on multimedia system architecture, the most important consideration is
the multimedia application itself.
• Applications include training, conferencing, messaging, and education apps.
3 b) What do you understand by multimedia file synchronization?
Lecture 11 - Synchronization in Multimedia Systems
Formant synthesis models the vocal tract as a digital filter with resonators and
anti-resonators. These systems use a low-pass filtered periodic pulse train as a source
for voiced signals and a random noise generator as an unvoiced source. A mixture of the
two sources can also be used for speech units that have both voiced and unvoiced
properties. Rules specify the time-varying values of the control parameters for the
filter and excitation; this is probably the most widely used approach in commercial
synthesis systems. Linear predictive synthesis uses linear predictive coding to
represent the output signal. Each speech sample is represented as a linear combination
of the N previous samples plus an additive excitation term. As in formant synthesis, the
excitation term uses a pulse train for a voiced signal and noise for an unvoiced one.
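The linear predictive model above (each sample = weighted sum of the N previous samples + excitation) can be sketched in a few lines. The coefficients and excitation here are toy values, not a trained speech model:

```python
# Sketch of LPC synthesis: s[n] = sum_k a_k * s[n-k] + e[n],
# with toy coefficients and excitation (not real speech data).

def lpc_synthesize(coeffs, excitation):
    """coeffs[k-1] multiplies the sample k steps back; e[n] is the excitation."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out

# A single impulse through a one-tap filter (a1 = 0.5) decays geometrically:
print(lpc_synthesize([0.5], [1, 0, 0, 0]))  # [1, 0.5, 0.25, 0.125]
```

In a real synthesizer, the excitation would be a pulse train (voiced) or noise (unvoiced), and the coefficients would change frame by frame.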
While the source-filter-based systems are capable of producing quite intelligible speech,
they have a distinct mechanical sound, and would not be mistaken for a human. This
quality arises from the simplifying assumptions made by the model and therefore is not
easily remedied.
Speech production model
In order to synthesize speech sounds artificially; we need a model of the speech
production system.
Speech compression, also known as speech coding or voice compression, refers to the
process of reducing the amount of data required to represent speech signals. It involves
encoding speech information in a compressed format so that it can be efficiently
transmitted, stored, or processed while maintaining an acceptable level of speech
quality.
Speech compression is necessary because raw speech signals typically contain a large
amount of redundant and irrelevant information. By applying various compression
techniques, it becomes possible to significantly reduce the data size of speech signals
without significantly degrading their intelligibility and quality.
The compression process involves two main steps: encoding and decoding.
1. Encoding: In this step, the speech signal is analyzed and transformed into a
compressed representation. Various techniques are employed to reduce redundancy
and exploit the characteristics of human speech perception, including predictive coding,
transform coding, quantization, and psychoacoustic modeling. The encoded
representation typically requires fewer bits than the original signal.
2. Decoding: In this step, the compressed representation is expanded back into a
speech signal for playback or further processing.
Low bit rate speech compression specifically focuses on compressing speech signals at
low data rates, typically ranging from a few kilobits per second (kbps) to a few tens of
kilobits per second. This type of compression is often employed in various applications
such as telephony, voice over IP (VoIP), mobile communication, and streaming services
where bandwidth or storage limitations exist.
Low bit rate speech compression algorithms employ various techniques to reduce the
data size of speech signals. Here are a few commonly used methods:
1. Source coding: Source coding techniques exploit the statistical properties of speech
signals to remove redundancy. This includes methods like predictive coding, transform coding,
and vector quantization. These techniques analyze the input speech signal, identify predictable
patterns, and represent them using fewer bits.
2. Variable bit rate coding: Instead of using a fixed data rate for compression, variable
bit rate coding allocates more bits to complex or critical parts of the speech signal while
using fewer bits for less important sections. This adaptive allocation allows for efficient
use of the available data rate, optimizing the overall quality.
3. Speech codecs: Speech codecs are specifically designed algorithms that combine
several compression techniques to achieve efficient speech compression. Common low
bit rate speech codecs include G.729, GSM, and Speex.
By applying these techniques, low bit rate speech compression algorithms can achieve
significant data reduction while maintaining reasonable speech quality.
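As one concrete example of the quantization step used in telephony speech coding, here is a sketch of mu-law companding, the nonlinearity behind the G.711 codec. The constant mu = 255 is the standard telephony value; the input sample 0.1 is an arbitrary example:

```python
import math

# Sketch: mu-law companding (G.711-style). Quiet samples get finer
# resolution than loud ones, so 8 companded bits can replace ~14 linear bits.

MU = 255

def mu_law_compress(x):
    """x in [-1, 1] -> companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign((math.exp(abs(y) * math.log1p(MU)) - 1) / MU, y)

x = 0.1
y = mu_law_compress(x)
print(round(y, 3))                 # the quiet input is pushed well up the scale
print(round(mu_law_expand(y), 6))  # round-trips back to ~0.1
```

A real codec would then quantize the companded value to 8 bits; the companding curve is what keeps quiet speech intelligible after that coarse quantization.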
In other words, the picture information is not transmitted alone. It is carried along with
various other signals, such as blanking pulses to make the retrace invisible and
synchronizing pulses to synchronize the scanning at the transmitter and the receiver.
All these components together are known as the Composite Video Signal (CVS).
Positive Polarity
In the case of positive polarity, the whiter the scene, the higher the amplitude of the
video signal. The blanking level is kept at zero; the sync pulse lies below zero, with the
sync tip at the most negative point, as shown in the figure below.
When the video signal is produced by photoconductive camera tubes, bright white light
gives a high-amplitude video signal. At the receiver, a stronger signal is needed to
reproduce white on the fluorescent screen, and zero signal to reproduce black.
Positive polarity
Negative Polarity
In the case of negative polarity, the brighter the scene, the smaller the amplitude. Here
the sync pulse is positive, that is, above the blanking level. Black is just below the
blanking level, and the brighter the scene, the lower its level below the blanking level;
white is near the bottom.
Negative polarity
The modulation of the RF carrier by the CVS is in the form of negative AM, where bright
picture points correspond to low carrier amplitude and the sync pulse to maximum
carrier amplitude. This type of modulation is called Negative Modulation.
In most of the TV systems, negative polarity is used to modulate the video carrier.
PAL is an abbreviation for Phase Alternate Line. This is the video format standard used in
many European countries. A PAL picture is made up of 625 interlaced lines and is displayed
at a rate of 25 frames per second.
SECAM is an abbreviation for Sequential Color with Memory (from the French Séquentiel
Couleur à Mémoire). This video format was developed in France and is used there, in the
countries of the former USSR, and in parts of Africa and Eastern Europe. Like PAL, a
SECAM picture is made up of 625 interlaced lines and is displayed at a rate of 25 frames
per second. However, because of the way SECAM processes the color information, it is not
compatible with the PAL video format standard.
On the other hand, JPEG (Joint Photographic Experts Group) and MPEG (Moving
Picture Experts Group) are both commonly used image and video compression formats,
but they serve different purposes.
JPEG (Joint Photographic Experts Group) and MPEG (Moving Picture Experts Group)
are both digital image and video compression standards developed by their respective
organizations. However, JPEG is primarily designed for compressing still images, while
MPEG is designed for compressing video.
The main difference between JPEG and MPEG lies in their compression techniques.
JPEG uses lossy compression, which means that it discards some of the image data in
order to reduce the file size. This can result in a loss of image quality, particularly when
the compression level is high. MPEG, on the other hand, combines lossy compression with
lossless entropy coding to achieve high compression ratios while maintaining acceptable
video quality.
Another key difference between JPEG and MPEG is the way they handle motion. JPEG
is designed for still images, so it does not take motion into account. MPEG, on the other
hand, is specifically designed for compressing video and can handle both inter-frame
and intra-frame compression to reduce file sizes.
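The inter-frame idea can be sketched with toy one-dimensional "frames": store one reference frame in full, plus per-pixel differences for the frames that follow, which are mostly zeros when little moves between frames:

```python
# Sketch: inter-frame compression as delta coding. Real video codecs use
# motion-compensated prediction; simple differencing is a stand-in here.

def encode_inter(frames):
    """Return the reference frame and per-frame pixel deltas."""
    ref = frames[0]
    deltas = [[b - a for a, b in zip(prev, cur)]
              for prev, cur in zip(frames, frames[1:])]
    return ref, deltas

def decode_inter(ref, deltas):
    """Rebuild all frames by accumulating deltas onto the reference."""
    frames = [ref]
    for d in deltas:
        frames.append([a + b for a, b in zip(frames[-1], d)])
    return frames

frames = [[10, 10, 10], [10, 11, 10], [10, 12, 11]]
ref, deltas = encode_inter(frames)
print(deltas)                               # [[0, 1, 0], [0, 1, 1]]
print(decode_inter(ref, deltas) == frames)  # True
```

A JPEG-style encoder would store each frame independently (intra only); the mostly-zero deltas show why exploiting temporal redundancy pays off for video.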
Overall, while JPEG and MPEG are both compression standards, they are optimized for
different types of media and employ different compression techniques.
Ans: Multimedia conferencing is a form of communication that allows multiple
participants to communicate using a combination of different media types, such as
audio, video, text, and images. It can be used for various purposes, including business
meetings, online learning, and remote collaboration.
Multimedia refers to the combination of different media types, such as text, audio, video,
and images, to deliver information. Multimedia is used in various applications, including
entertainment, education, advertising, and communication.
The main difference between multimedia and hypermedia is that multimedia refers to
the combination of different media types, while hypermedia refers to the use of
hyperlinks to navigate between those media types. Multimedia can exist without
hyperlinks, whereas hypermedia always includes hyperlinks as a fundamental feature.
3. Buffering Latency: In streaming applications, buffering is used to ensure a
smooth playback experience by preloading and storing a certain amount of
audio data before playback. The buffering introduces a delay to allow
sufficient data to be accumulated, reducing the chances of interruptions or
stutters during playback.
4. Hardware Latency: The hardware components involved in audio processing,
such as audio interfaces, sound cards, and audio drivers, can introduce
latency due to the time it takes for the data to pass through these
components.
Reducing audio latency is essential for maintaining a real-time and immersive audio
experience. It can be achieved through various techniques, including smaller buffer
sizes, optimized low-latency audio drivers, and faster processing hardware.
It's important to note that achieving extremely low latency can be challenging and may
require specialized hardware, software optimizations, and a well-designed system
architecture. The acceptable level of latency depends on the specific application and
user requirements.
A streaming protocol is a set of rules that defines how data is communicated from one
device or system to another across the Internet. Video streaming protocols standardize
the method of segmenting a video stream into smaller chunks that are more easily transmitted.
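The segmentation idea behind the HTTP-based protocols below can be sketched as follows. Real segmenters cut on keyframe boundaries and add manifests, so the fixed-size chunks here are a simplification:

```python
# Sketch: splitting a byte stream into fixed-size chunks, the basic move
# behind HTTP-based streaming (HLS, MPEG-DASH). Real segmenters cut on
# keyframe boundaries; fixed sizes are a simplification.

def segment(stream: bytes, chunk_size: int):
    """Return the stream as a list of chunks of at most chunk_size bytes."""
    return [stream[i:i + chunk_size] for i in range(0, len(stream), chunk_size)]

data = bytes(range(10))
chunks = segment(data, 4)
print([len(c) for c in chunks])   # [4, 4, 2]
print(b"".join(chunks) == data)   # True: nothing is lost in segmentation
```

Because each chunk is an ordinary HTTP resource, players can fetch chunks at different bitrates as network conditions change, which is what adaptive bitrate (ABR) streaming exploits.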
1. HTTP Live Streaming (HLS)
Type
● HTTP-Based
Pros
Cons
● Latency: HLS cannot maintain as low a latency as some of the other preferred
protocols, so streams are typically delayed by several seconds.
● Poor Ingest: HLS isn't the best option for ingest, since HLS-compatible encoders are
not widely accessible or affordable.
2. Dynamic Adaptive Streaming over HTTP (MPEG-DASH)
Type
● HTTP-Based
Pros
● Adaptability: This protocol leverages ABR to stream at a high video quality over
different Internet speeds and conditions.
● Customization: MPEG-DASH is open-source, enabling users to tailor it to meet
their unique streaming needs.
Cons
● Compatibility: MPEG-DASH is not natively supported on Apple devices and browsers,
which favor HLS.
3. WebRTC
Type
● Modern Protocol
Pros
● Flexibility: Since WebRTC is open-source, it is flexible enough that developers can
customize it to suit their specific streaming requirements.
Cons
● Limited Support: The WebRTC video streaming protocol has only recently been
adopted as a web standard. The market has not had much time to adapt, so engineers
might encounter compatibility issues with this streaming setup.
4. Secure Reliable Transport (SRT)
Type
● Modern Protocol
Pros
● Security: This protocol features top-notch security and privacy tools that allow
broadcasters to rest assured their streaming content and viewers remain safe.
● Compatibility: SRT is device and operating system agnostic, making it highly
compatible and able to deliver streams to most Internet-enabled devices.
● Low Latency: The SRT streaming protocol features low-latency streaming thanks to
the support from error correction technology.
Cons
● Limited Support: Like WebRTC, SRT is still relatively new; the larger streaming
industry will need some time to evolve before this video protocol becomes
standardized.
5. Transmission Control Protocol (TCP)
Type
● Internet Protocol
Pros
● Reliability: If data packets arrive out of order or pieces are missing, the protocol
communicates with the sender to ensure each piece arrives where it should be.
Cons
● Slow Speed: The reordering and retransmission of the data packet cause TCP to
transmit slowly.
● Heavy Protocol: TCP requires three packets to set up a socket connection before
sending data.
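TCP's behavior described above can be seen in a minimal loopback sketch: the three-way handshake happens inside connect/accept, after which data arrives intact and in order. The message text and the ephemeral port are arbitrary examples:

```python
import socket
import threading

# Sketch: TCP connection setup and reliable delivery over loopback.
# The three-way handshake is performed inside connect()/accept().

def run_server(listener, results):
    conn, _ = listener.accept()          # handshake completes here
    with conn:
        results.append(conn.recv(1024))  # data arrives intact and in order

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

results = []
t = threading.Thread(target=run_server, args=(listener, results))
t.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello over TCP")

t.join()
listener.close()
print(results[0])   # b'hello over TCP'
```

The setup cost (socket, handshake, teardown) is exactly the "heavy protocol" overhead noted in the cons above, and is why low-latency streaming protocols often prefer UDP underneath.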
PAPER 2016
Que 1(a) What is multimedia? Describe various multimedia applications in detail.
MULTIMEDIA
Multimedia is the representation of information in an attractive and interactive manner using
a combination of text, audio, video, graphics and animation. Examples include e-mail, Yahoo
Messenger, video conferencing, and the Multimedia Message Service (MMS).
Multimedia as the name suggests is the combination of Multi and Media that is many types of
media (hardware/software) used for communication of information.
Components of Multimedia
Multimedia consists of the following 5 components:
Text
Characters are used to form the words, phrases, and paragraphs of text. Text appears
in some form in all multimedia creations. The text can be in a variety of fonts and sizes
to match the professional presentation of the multimedia software. Text in multimedia
systems can communicate specific information or serve as a supplement to the
information provided by the other media.
Graphics
Graphics are non-text information, such as a sketch, chart, or photograph, represented
digitally. They add to the appeal of a multimedia application. In many circumstances,
people dislike reading large amounts of text on computers, so pictures are used more
frequently than words to clarify concepts, offer background information, and so on.
Graphics are at the heart of any multimedia presentation; the use of visuals in
multimedia enhances the effectiveness and presentation of the concept.
Animations
Animation is a sequence of still images flipped through in rapid succession to give the
impression of movement: the process of making a still image appear to move. Animation
can also make a presentation lighter and more appealing, and it is quite popular in
multimedia applications. Commonly used animation viewing programs include Fax
Viewer, Internet Explorer, etc.
Video
Video consists of photographic images played back at speeds of 15 to 30 frames per
second so that they appear to be in full motion. The term video refers to a moving
image accompanied by sound, such as a television picture. Text can of course be
included in videos, either as captioning for spoken words or as text embedded in an
image, as in a slide presentation. Programs widely used to view videos include
RealPlayer, Windows Media Player, etc.
Audio
Audio is any sound, whether music, conversation, or something else. Sound is one of
the most important aspects of multimedia, delivering the joy of music, special effects,
and other forms of entertainment. Volume and sound pressure level are measured in
decibels. Audio files are used as part of the application context as well as to enhance
interaction. When audio appears within online applications and webpages, it must
occasionally be delivered using plug-in media players. MP3, WMA, WAV, MIDI, and
RealAudio are examples of audio formats.
Applications of Multimedia
Entertainment
The usage of multimedia in films creates a unique auditory and video impression.
Today, multimedia has completely transformed the art of filmmaking around the world.
Multimedia is the only way to achieve difficult effects and actions.
The entertainment sector makes extensive use of multimedia. It’s particularly useful for
creating special effects in films and video games. Interactive games become possible
thanks to the use of multimedia in the gaming business. Video games are more
interesting because of the integrated audio and visual effects.
Business
Marketing, advertising, product demos, presentation, training, networked
communication, etc. are applications of multimedia that are helpful in many businesses.
The audience can quickly understand an idea when multimedia presentations are used.
It gives a simple and effective technique to attract visitors’ attention and effectively
conveys information about numerous products. It’s also utilized to encourage clients to
buy things in business marketing.
Engineering
Multimedia is frequently used by software engineers in computer simulations for military
or industrial training. It’s also used for software interfaces created by creative experts
and software engineers in partnership.
Fine Arts
'Digital artist' is a new term for these types of artists. Digital painters create digital
paintings, matte paintings, and vector graphics of many varieties using computer
applications.
List some advantages of Multimedia.
(i) It is interactive and integrated: The digitization process integrates all of the
various media. The ability to receive immediate input enhances interactivity.
(ii) It is quite user-friendly: The user expends little effort, since they can sit and
watch the presentation, read the text, and listen to the audio.
(iii) It is flexible: Because it is digital, this media can be easily shared and adapted
to suit various settings and audiences.
(iv) It appeals to a variety of senses: Multimedia makes extensive use of the user's
senses, for example hearing, observing, and conversing.
(v) It is available for all types of audiences: It can be utilized for a wide range of
audiences, from a single individual to a group of people.
Ques 1(b) Any hardware that can send and receive data, instructions, and information is
referred to as a communications device. A modem is one kind of communication tool
that joins a channel to a sending or receiving device, like a computer. Data is processed
by computers as digital signals.
Types of Communication Device
1. Dial-up Modem: As previously said, digital signals from a computer must be
converted to analogue signals before being sent over regular telephone lines. This
conversion is carried out by a modem, often known as a dial-up modem, which is a
communications device. The word modem combines modulate, which turns a digital
signal into an analogue signal, and demodulate, which turns an analogue signal into a
digital signal.
A modem typically takes the shape of an adapter card that you place in an expansion
slot on the motherboard of a computer. A normal telephone cord has two ends: one
plugs into a port on the modem card and the other into a phone outlet.
2. ISDN and DSL Modems: You require a communications device to transmit and
receive the digital signals from the provider if you use ISDN or DSL to access the
Internet. A computer's digital data and information can be transmitted to and received
from an ISDN connection using an ISDN modem. A DSL modem transfers and receives
digital data and information over a DSL line, both from a computer and another device.
Typically, ISDN and DSL modems are external devices that connect to a port on the
system unit at one end and the telephone line at the other.
3. Cable Modems: A cable modem transmits and receives digital data over the cable
television network. Because more than 110 million households are wired for cable
television, cable modems offer home users a speedier option than dial-up, with speeds
comparable to DSL. Cable modems can currently transport data at substantially faster
speeds than dial-up modems and ISDN.
4. Wireless Modems: Some mobile users have a wireless modem that connects to the
Internet wirelessly from a laptop computer, a smartphone, or another portable device.
Wireless modems may be external or built in, and come in PC Card, ExpressCard
module, and flash memory card formats.
5. Network Cards: There are many different types of network cards. A desktop
computer's network card is an adapter card with a port where a cable can be connected.
Network cards for portable computers and devices include flash cards, PC Cards,
ExpressCard modules, and USB network adapters. There are also network cards that
enable wireless data transmission; this kind of card, also known as a wireless network
card, often has an antenna.
An Ethernet or token ring network card complies with the rules of a specific network
communications standard. The most popular kind of network card is the Ethernet card.
6. Wireless Access Points: A wireless access point is a hub for communications that
enables computers and other devices to communicate wirelessly with one another or
wirelessly with a wired network. For the best signal reception, wireless access points
include high-quality antennas.
Q2 a) Explain the multimedia distributed processing model
The multimedia distributed processing model refers to a system architecture designed to handle
the processing and distribution of multimedia content across a network of interconnected
devices. This model is particularly useful when dealing with large-scale multimedia applications
that require significant computational resources and real-time delivery.
1. Source Devices: These are the devices that capture or generate multimedia content, such
as cameras, microphones, sensors, or software applications. Source devices produce raw or
encoded multimedia data, which serves as the input for the distributed processing model.
2. Processing Nodes: Processing nodes are the computational units responsible for executing
specific processing tasks on the multimedia data. They can be distributed across multiple
devices, such as personal computers, servers, or specialized hardware accelerators.
Processing nodes may perform various operations, such as encoding, decoding, filtering,
analysis, or transformation of the multimedia content.
3. Communication Network: The network infrastructure that connects the source devices,
processing nodes, and destination devices and carries the multimedia data streams between
them.
4. Middleware: Middleware acts as a software layer that abstracts the underlying hardware and
provides a common communication interface for the distributed processing model. It enables
efficient coordination and collaboration among the processing nodes and facilitates the
management of multimedia data streams. Middleware also handles data synchronization,
resource allocation, and error handling within the distributed system.
5. Destination Devices: Destination devices receive the processed multimedia content and
deliver it to the end users or display it on output devices. These devices can include personal
computers, smartphones, tablets, television screens, projectors, or other display and playback
devices. The processed multimedia data may be transmitted over the network or stored locally
for later retrieval.
The model typically operates through the following workflow:
1. Data Capture: The source devices capture or generate multimedia content, such as
audio, video, or sensor data.
2. Data Encoding: The raw multimedia data is often encoded into a compressed format to
reduce its size and facilitate efficient transmission over the network. Encoding can be
performed at the source devices or at dedicated encoding nodes.
3. Data Distribution: The encoded multimedia data is transmitted over the communication
network to the processing nodes responsible for executing specific tasks.
4. Data Processing: The processing nodes execute their assigned operations, such as
decoding, filtering, analysis, or transformation, on the data they receive.
5. Data Fusion and Analysis: After processing, the results from different processing nodes
may need to be combined or analyzed to obtain meaningful insights or generate a final
multimedia output.
6. Data Delivery: The processed multimedia data is delivered to the destination devices
through the communication network. It can be streamed in real time or stored locally for later
retrieval.
7. Presentation: The destination devices present the processed multimedia content to the
end users through display or playback mechanisms, enabling them to perceive and interact with
the multimedia application.
The multimedia distributed processing model offers several advantages, including increased
scalability, enhanced computational power, improved fault tolerance, and efficient utilization of
resources. It enables the development of complex multimedia applications that require
high-performance processing and real-time delivery across distributed environments.
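As a rough sketch, the capture → encode → process → deliver flow described above can be expressed as a chain of stage functions. This is only an illustration with assumed names and toy data, not part of any real framework; in a real deployment each stage would run on a separate networked node rather than in one process.

```python
# Toy sketch of the distributed processing pipeline: each function stands
# in for a node that would normally run on its own machine.

def capture():
    # Source device: produce raw "samples" (here, plain numbers).
    return [0.0, 0.5, 1.0, 0.5, 0.0]

def encode(raw):
    # Encoding node: scale samples to 8-bit integers (a toy "codec").
    return [round(x * 255) for x in raw]

def process(encoded):
    # Processing node: a simple filter that clips bright values.
    return [min(v, 200) for v in encoded]

def deliver(processed):
    # Destination device: map values back to the 0.0-1.0 range for playback.
    return [v / 255 for v in processed]

stream = deliver(process(encode(capture())))
print(stream)
```

In a real system the hand-offs between stages would be network transfers coordinated by the middleware layer described above.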
The various components of multimedia are Text, Audio, Graphics, Video and Animation.
All these components work together to represent information in an effective and easy
manner.
AUDIO:
An audio format refers to the structure, encoding, and organization of audio data in a
digital file. It specifies how audio signals are stored, compressed, and represented,
determining crucial factors such as audio quality, file size, compatibility, and playback
capabilities. Audio formats enable the storage, transmission, and reproduction of sound
in various digital media applications.
Audio formats fall into three categories:
1. Uncompressed Format
2. Lossy Compressed Format
3. Lossless Compressed Format
1. Uncompressed Audio Format:
● PCM –
It stands for Pulse-Code Modulation. It represents raw analog audio signals in
digital form. To convert an analog signal into a digital one, the signal is
sampled at regular intervals, so PCM is characterized by a sampling rate and a
bit depth (the number of bits used to represent each sample). It is an exact
representation of the analog sound and does not involve compression. It is the
most common audio format used in CDs and DVDs.
● WAV –
It stands for Waveform Audio File Format; it was developed by Microsoft and
IBM in 1991. It is essentially a Windows container for audio formats, which
means a WAV file can contain compressed audio, although most WAV files contain
uncompressed audio in PCM format; the WAV file is just a wrapper around it. It
is compatible with both Windows and Mac.
● AIFF –
It stands for Audio Interchange File Format. It was developed by Apple for
Mac systems in 1988. Like WAV files, AIFF files can contain multiple kinds of
audio. It contains uncompressed audio in PCM format. It is just a wrapper for
the PCM encoding. It is compatible with both Windows and Mac.
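Because uncompressed PCM stores every sample directly, file size follows immediately from the sampling rate, bit depth, and channel count. A minimal sketch (the helper name is ours, not a standard API):

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    # Uncompressed PCM size = samples/second x bytes/sample x channels x time.
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

# CD-quality audio: 44,100 samples/s, 16 bits per sample, stereo.
one_minute = pcm_size_bytes(44_100, 16, 2, 60)
print(one_minute)  # 10584000 bytes, roughly 10 MB per minute
```

This is why the lossy formats below matter: a minute of CD audio takes about 10 MB uncompressed but only about 1 MB at a typical MP3 bitrate.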
2. Lossy Compressed Format:
It is a form of compression that loses data during the compression process, but the
difference in quality is usually not noticeable to the listener.
● MP3 –
It stands for MPEG-1 Audio Layer 3. It was released in 1993 and quickly became
popular; it is the most widely used audio format for music files. The main aim
of MP3 is to remove the sounds that are inaudible or barely noticeable to human
ears, thereby making the music file smaller. MP3 is close to a universal format
and is compatible with almost every device.
● AAC –
It stands for Advanced Audio Coding. It was developed in 1997, after MP3. The
compression algorithm used by AAC is much more complex and advanced
than MP3, so when comparing a particular audio file in MP3 and AAC formats
at the same bitrate, the AAC one will generally have better sound quality. It is
the standard audio compression method used by YouTube, Android, iOS,
iTunes, and PlayStations.
● WMA –
It stands for Windows Media Audio. It was released in 1999 and was designed
to remove some of the flaws of the MP3 compression method. In terms of
quality it is better than MP3, but it is not widely used.
3. Lossless compression:
This method reduces file size without any loss in quality. However, it is not as
space-efficient as lossy compression: a losslessly compressed file is typically two to
three times larger than a lossy-compressed one.
● FLAC –
It stands for Free Lossless Audio Codec. It can compress a source file by up
to 50% without losing data. It is most popular in its category and is
open-source.
● ALAC –
It stands for Apple Lossless Audio Codec. It was developed by Apple, launched
in 2004, and made open-source and royalty-free in 2011.
● WMA Lossless –
It stands for Windows Media Audio Lossless. It is the least efficient of the
three in terms of compression, is not open-source, and has limited hardware
support.
VIDEO:
Any video file has two components: a container and a codec. The video format is the
container that holds the audio, video, subtitles, and other metadata. A codec is a
program that encodes and decodes multimedia data such as audio and video.
A video codec encodes and compresses the video, while an audio codec does the
same with sound while making a video. The encoded video and audio are then synced
and saved in a file format media container.
Video Codec
A video codec (the name codec originates from “enCOde / DECode”) is a protocol for
encoding and decoding video. H.264, MPEG-4, and DivX are examples of common
codecs. A well-designed codec has high efficiency or the capacity to maintain quality
while shrinking the file size.
Video Container
The container format specifies how the metadata and data in a file are organized, not
how the video is encoded (which the codec determines). The metadata and
compressed video data encoded with the codec are stored in the container file. The
file’s extension reflects the container format, also referred to as “the format.” .AVI, .MP4,
and .MOV are common container types. Container formats can be combined with a
variety of codecs to determine which devices and programs the file is compatible with.
Common container formats for video files include MP4, MOV, AVI, MKV, and WMV.
Que 3 a) What is multimedia communication? Explain any two multimedia
networks.
Multimedia communication involves presenting information in multiple media formats.
Images, video, audio, and text are all part of multimedia communication, although a
single instance of multimedia communication does not have to include all four components.
Applications:
● person-to-person communications (e.g. email)
● person-to-system communications (e.g. web-browsing)
Different types of networks are used to provide multimedia communication services; for an
overview of multimedia networks, see:
http://www.eie.polyu.edu.hk/~enyhchan/mt_intro.pdf
Q4 a) Explain the concept of MPEG in detail.
The Moving Picture Experts Group (MPEG) is an alliance of working groups established
jointly by ISO and IEC that sets standards for media coding, including compression
coding of audio, video, graphics, and genomic data; and transmission and file formats
for various applications.
MPEG formats are used in various multimedia systems. The most well known older
MPEG media formats typically use MPEG-1, MPEG-2, and MPEG-4 AVC media coding
and MPEG-2 systems transport streams and program streams. Newer systems typically
use the MPEG base media file format and dynamic streaming (a.k.a. MPEG-DASH).
MPEG-1 (1993): Coding of moving pictures and associated audio for digital storage
media at up to about 1.5 Mbit/s
MPEG-2 (1996): Generic coding of moving pictures and associated audio information
(ISO/IEC 13818). Transport, video and audio standards for broadcast-quality television.
MPEG-2 standard was considerably broader in scope and of wider appeal – supporting
interlacing and high definition. MPEG-2 is considered important because it was chosen
as the compression scheme for over-the-air digital television ATSC, DVB and ISDB,
digital satellite TV services like Dish Network, digital cable television signals, SVCD and
DVD Video.[23] It is also used on Blu-ray Discs, but these normally use MPEG-4 Part
10 or SMPTE VC-1 for high-definition content.
b) Sampling
Since an analogue image is continuous not just in its co-ordinates (x axis) but also in its
amplitude (y axis), the part of digitization that deals with the co-ordinates is known as
sampling. In digitizing, sampling is done on the independent variable; in the case of the
equation y = sin(x), it is done on the x variable.
A continuous signal typically contains random variations caused by noise. In sampling we
reduce this noise by taking samples: the more samples we take, the better the quality of
the image and the more the noise is removed, and vice versa. However, sampling on the
x axis alone does not convert the signal to digital form; the y axis must also be sampled,
a step known as quantization.
Sampling has a relationship with image pixels. The total number of pixels in an image is
calculated as pixels = total number of rows x total number of columns. For example, an
image with a total of 36 pixels is a square image of 6 x 6. As noted above, more samples
eventually result in more pixels, so taking 36 samples of the continuous signal on the
x axis corresponds to the 36 pixels of this image. The number of samples is also directly
equal to the number of sensors on the CCD array.
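The 36-sample example above can be sketched in code: sampling reads a continuous signal such as y = sin(x) at equally spaced points on the x axis. The sample helper below is illustrative only.

```python
import math

def sample(signal, start, stop, n_samples):
    # Take n_samples equally spaced readings of a continuous signal.
    step = (stop - start) / (n_samples - 1)
    return [signal(start + i * step) for i in range(n_samples)]

# 36 samples of y = sin(x), mirroring the 6 x 6 = 36 pixel example above.
samples = sample(math.sin, 0.0, 2 * math.pi, 36)
print(len(samples))  # 36 - one sample per pixel (and per CCD sensor)
```

Note that the samples are still real-valued; turning them into a finite set of levels is the quantization step discussed next.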
Quantization
Quantization is the counterpart of sampling: it is done on the y axis, while sampling is
done on the x axis. Quantization is the process of transforming a real-valued sampled
image into one taking only a finite number of distinct values. Under the quantization
process the amplitude values of the image are digitized. In simple words, when you
quantize an image, you are dividing a signal into quanta (partitions).
Now let's see how quantization is done. Here we assign levels to the values generated by
the sampling process. Although the samples have been taken, they still span a continuous
range of gray-level values vertically. Under quantization these vertically ranging values
are divided into a fixed set of levels or partitions, for example 5 levels ranging from
0 (black) to 4 (white). The number of levels can vary according to the type of image you
want.
There is a relationship between quantization and gray-level resolution. An image quantized
to 5 levels of gray would contain only 5 distinct tones: more or less a black-and-white
image with a few shades of gray.
When we want to improve the quality of the image, we can increase the number of levels
assigned to the sampled image; increasing it to 256 gives a standard grayscale image.
Whatever level we assign is called a gray level. Most digital image-processing devices
quantize into k equal intervals: if b bits per pixel are used, the number of levels is
k = 2^b.
The number of quantization levels should be high enough for human perception of fine
shading details in the image. The occurrence of false contours is the main problem in
an image that has been quantized with insufficient brightness levels.
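The level assignment described above can be sketched as follows; quantize is an illustrative helper that maps a real-valued sample in [0, 1] to one of k discrete gray levels.

```python
def quantize(value, levels):
    # Map a value in [0.0, 1.0] to the nearest of `levels` gray levels.
    return min(levels - 1, round(value * (levels - 1)))

# 5 levels, as in the example above: 0 = black ... 4 = white.
print([quantize(v, 5) for v in [0.0, 0.2, 0.5, 0.8, 1.0]])  # [0, 1, 2, 3, 4]

# With b bits per pixel there are k = 2**b levels; b = 8 gives 256 grays.
print(2 ** 8)  # 256
```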
Q6 a) Explain the NTSC and PAL video standards
Ans: NTSC (National Television System Committee) and PAL (Phase Alternating Line) are two
different video standards used for analog television broadcasting and playback. Here's an
explanation of each standard:
1. NTSC:
NTSC is the video standard primarily used in North America, parts of South America, Japan,
and some other countries. It was developed by the National Television System Committee.
NTSC video has the following characteristics:
- Frame rate: NTSC operates at a frame rate of 29.97 frames per second (approximately 30
frames per second).
- Resolution: NTSC uses 525 scan lines per frame; in digital form this corresponds to a
standard resolution of 720 x 480 pixels.
- Color encoding: NTSC uses a composite video signal in which the color (chrominance)
information is modulated onto a subcarrier and combined with the luminance (brightness)
signal.
- Aspect ratio: The aspect ratio for NTSC is 4:3, which means the width of the screen is four
units for every three units of height.
2. PAL:
PAL is the video standard used in most of Europe, Australia, Asia, Africa, and some parts of
South America. It stands for Phase Alternating Line. PAL video has the following characteristics:
- Frame rate: PAL operates at a frame rate of 25 frames per second.
- Resolution: PAL uses 625 scan lines per frame; in digital form this corresponds to a
standard resolution of 720 x 576 pixels.
- Color encoding: PAL uses a different color encoding system called phase alternation, which
helps in reducing color artifacts and provides better color reproduction compared to NTSC.
- Aspect ratio: PAL also has an aspect ratio of 4:3, similar to NTSC.
One notable difference between NTSC and PAL is the frame rate. NTSC has a higher frame
rate than PAL, but PAL offers better color reproduction and is less prone to video artifacts. As a
result, NTSC and PAL videos are not directly compatible, and TVs and video players need to
support both standards to play content from different regions.
With the advent of digital television and the transition to high-definition formats, the NTSC and
PAL standards have become less relevant. Most modern systems now use digital video
standards such as ATSC (Advanced Television Systems Committee) in North America and DVB
(Digital Video Broadcasting) in Europe and other regions. These digital standards offer higher
resolutions, improved image quality, and additional features compared to their analog
predecessors.
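The figures above can be cross-checked with a quick sketch (the helper function is illustrative, not part of any standard):

```python
def frame_stats(width, height, fps):
    # Pixels per frame and pixels pushed per second for a video standard.
    pixels = width * height
    return pixels, pixels * fps

ntsc = frame_stats(720, 480, 29.97)  # (345600, ~10.36 million pixels/s)
pal = frame_stats(720, 576, 25)      # (414720, 10368000 pixels/s)
print(ntsc, pal)
```

PAL trades temporal resolution (fewer frames per second) for spatial resolution (more lines per frame); the per-second pixel rates of the two standards come out almost identical.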
Q5) (a) What is JPEG? Enumerate its objectives, applications and architecture
Image compression
Image compression is the method of data compression on digital images.
JPEG compression
JPEG stands for Joint Photographic Experts Group. It is the first international standard in
image compression and is widely used today. JPEG can be lossy as well as lossless, but the
technique discussed here is the lossy compression technique.
JPEG (pronounced JAY-peg) is a graphic image file compressed with lossy compression using the
standard developed by the ISO/IEC Joint Photographic Experts Group. JPEG is one of the most
popular image formats on the internet and is supported by most web browsers and image editing
software.
JPEG images are compressed using a technique called the discrete cosine transform (DCT). The
DCT is a mathematical transform that breaks an image (processed in 8 x 8 pixel blocks) down
into a weighted sum of cosine functions. The resulting coefficients are then compressed using
techniques such as quantization and Huffman coding.
The amount of compression that can be applied to a JPEG image depends on the quality setting. The
higher the quality setting, the less compression is applied, and the larger the file size. The lower the
quality setting, the more compression is applied, and the smaller the file size.
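The DCT-and-quantize pipeline can be illustrated in one dimension on a single 8-sample row of pixel values. This is a simplified sketch: real JPEG applies a 2-D DCT to 8 x 8 blocks and uses a full quantization table, whereas here a single assumed step size stands in for the quality setting (a larger step means lower quality and a smaller file).

```python
import math

N = 8  # JPEG processes image data in 8-sample blocks (8 x 8 in two dimensions)

def dct(x):
    # Orthonormal 8-point DCT-II: express the samples as cosine amplitudes.
    out = []
    for k in range(N):
        c = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(c * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

def idct(X):
    # Inverse transform (DCT-III): rebuild the samples from the amplitudes.
    return [sum((math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) * X[k]
                * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

row = [52, 55, 61, 66, 70, 61, 64, 73]          # one row of pixel values
step = 16                                        # quantization step: the lossy part
quantized = [round(c / step) for c in dct(row)]  # small coefficients collapse to 0
restored = [round(v) for v in idct([q * step for q in quantized])]
print(quantized)  # mostly zeros - this sparsity is what compresses well
print(restored)   # close to, but not exactly, the original row
```

The long runs of zero coefficients are what entropy coding (Huffman coding in JPEG) then packs into very few bits; the rounding in the quantize step is exactly where the "lossy" quality loss happens.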
Advantages of JPEG:
● Small file size. JPEG images are typically much smaller than uncompressed images,
which makes them ideal for storing and transmitting images over the internet.
● Wide support. JPEG is supported by most web browsers and image editing software.
● Good quality. JPEG images can be of good quality, even with a high degree of
compression.
Table comparing the differences between lossy (for example, JPEG) and lossless compression.
Disadvantages of JPEG:
● Lossy compression. Some data is lost during the compression process, which can
lead to a loss of quality in the image.
● Not suitable for all images. For example, JPEG is not a good choice for images with
sharp edges or fine details.
Q5) b) Explain the process of Speech Recognition and Generation
Speech Recognition is the ability to translate a dictation or spoken word to text.
It is achieved by following certain steps and the software responsible for it is known as a ‘Speech
Recognition System’.
SR systems are usually implemented in the form of dictation software and intelligent assistants in personal
computers, smartphones, web browsers and many other devices.
Key terms in speech recognition include:
Pronunciation: What the speech engine thinks a word should sound like.
Grammar: It defines the domain, or context, within which the recognition engine works.
Accuracy: The ability of a recognizer can be examined by measuring its accuracy, or how well
it recognizes utterances.
Training: Some speech recognizers have the ability to adapt to a speaker. When the system has
this ability, it may allow training to take place.
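The "Accuracy" measure above can be computed as the fraction of utterances the recognizer transcribes correctly. The data below is hypothetical, purely for illustration:

```python
def accuracy(recognized, reference):
    # Fraction of utterances recognized exactly right.
    correct = sum(1 for r, t in zip(recognized, reference) if r == t)
    return correct / len(reference)

# Hypothetical recognizer output vs. what was actually said.
reference  = ["open file", "save file", "close window", "print page"]
recognized = ["open file", "safe file", "close window", "print page"]
print(accuracy(recognized, reference))  # 0.75 - three of four utterances correct
```

Real systems usually report finer-grained measures such as word error rate, but the idea is the same: compare the recognizer's output against a known reference.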
Speech generation(synthesis)
Speech synthesis is the artificial production of human speech. A computer system used for
this purpose is called a speech synthesizer, and can be implemented in software or hardware
products. A text-to-speech (TTS) system converts normal language text into speech; other
systems render symbolic linguistic representations like phonetic transcriptions into
speech.[1] The reverse process is speech recognition.
Synthesized speech can be created by concatenating pieces of recorded speech that are
stored in a database. Systems differ in the size of the stored speech units; a system that
stores phones or diphones provides the largest output range, but may lack clarity. For
specific usage domains, the storage of entire words or sentences allows for high-quality
output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other
human voice characteristics to create a completely "synthetic" voice output.[2]
The quality of a speech synthesizer is judged by its similarity to the human voice and by its
ability to be understood clearly. An intelligible text-to-speech program allows people with
visual impairments or reading disabilities to listen to written words on a home computer.
Many computer operating systems have included speech synthesizers since the early 1990s.
JBIG (Joint Bi-level Image Experts Group) compression:
The JBIG standard compresses bilevel (two-tone) images by predicting each pixel from a
context of neighboring, previously coded pixels. A small template of surrounding pixels
forms the context, and an adaptive probability model conditioned on that context is used to
code the current pixel; pixels that are well predicted by their context cost very few bits
to encode.
The JBIG algorithm uses a technique called arithmetic coding to compress the data.
Arithmetic coding encodes a whole sequence of symbols as a single fractional number,
spending fewer bits on the symbols that occur most frequently, which results in strong
compression.
The JBIG standard offers several benefits over other compression techniques for bilevel
images. Firstly, it can achieve high compression ratios while maintaining a high level of image
quality, making it ideal for applications such as document scanning and archiving. Additionally,
JBIG compression is efficient for compressing multiple pages of documents with similar content,
such as faxes or reports.
Overall, the JBIG standard is a powerful and effective compression technique for bilevel images,
offering high compression ratios and excellent image quality. It has been widely adopted in
applications such as fax transmission, document imaging, and digital archiving.
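JBIG's context-modelled arithmetic coder is too involved to reproduce here, but the redundancy it exploits, long uniform runs of black or white pixels, can be illustrated with a much simpler run-length encoding sketch (an illustration only, not the actual JBIG algorithm):

```python
def rle_encode(bits):
    # Collapse a bilevel scanline into (pixel_value, run_length) pairs.
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    # Expand the (value, length) pairs back into the original scanline.
    return [v for v, n in runs for _ in range(n)]

# A scanline from a scanned document: long runs of white (0) around ink (1).
line = [0] * 20 + [1] * 4 + [0] * 40
runs = rle_encode(line)
print(runs)  # [(0, 20), (1, 4), (0, 40)] - 3 pairs instead of 64 pixels
assert rle_decode(runs) == line  # lossless round trip
```

Scanned documents are mostly white space, so even this naive scheme shrinks them dramatically; JBIG's context modelling pushes the same idea much further.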
Write short notes on:
a. Need of compression
b. Multimedia conferencing
c. Media stream protocol.
Ans: A. In the realm of multimedia technology, the need for compression has emerged
as an indispensable aspect. Compression refers to the process of reducing the size of
data files while maintaining their perceptual quality. It plays a vital role in various
multimedia applications, such as audio, video, images, and more. Some of the reasons
why compression is crucial in multimedia technology:
Efficient Storage: Multimedia files, such as videos, audio tracks, and high-resolution
images, often occupy a significant amount of storage space. Compression techniques
enable the reduction of file sizes without sacrificing their quality. By compressing
multimedia files, it becomes feasible to store and manage a vast amount of data
efficiently. This benefit is particularly crucial in scenarios where storage resources are
limited or costly.
Device and Platform Compatibility: Multimedia content is consumed across a wide array
of devices, platforms, and media players. Compression facilitates compatibility by
reducing the file size, making it easier to transfer and play multimedia files on various
devices with varying storage capacities and computational capabilities. It ensures that
multimedia content can be accessed and enjoyed across different platforms without
encountering compatibility issues.
Cost Savings: Storage and bandwidth requirements directly influence the cost
associated with multimedia applications. By compressing multimedia files, organizations
can significantly reduce storage expenses and optimize bandwidth usage, leading to
substantial cost savings. This benefit is particularly important for businesses that deal
with large volumes of multimedia data, such as media production companies or
streaming platforms.
B. Multimedia conferencing enables participants at different locations to meet using audio,
video, and shared data over a network. Its benefits include:
Cost and Time Savings: By eliminating the need for travel and accommodation,
multimedia conferencing significantly reduces travel expenses and saves time.
Participants can join meetings from their own locations, avoiding the logistical
challenges associated with physical meetings. This benefit is particularly valuable for
global businesses, remote teams, and individuals seeking efficient communication
solutions.
Flexibility and Accessibility: Multimedia conferencing offers flexibility and accessibility,
allowing participants to connect from anywhere with an internet connection. This
accessibility breaks down geographical barriers, enabling individuals from different
locations and time zones to participate in meetings and discussions conveniently.
C. Media stream protocols govern how multimedia data is packaged and delivered over a
network. One important aspect of streaming protocols is that both the output device and
the viewer have to support the protocol in order for it to work.
For example, if you’re sending a stream in MPEG-DASH, but the video player on the
device to which you’re streaming doesn’t support MPEG-DASH, your stream won’t
work.
For this reason, standardization is important. There are currently a few major media
streaming protocols in widespread use, which we’ll look at in detail in a moment. Six
common protocols include: