
EC8002-Multimedia Compression and Communication Department of ECE

UNIT 5
Multimedia Applications
The Internet supports a large variety of useful and entertaining multimedia applications.
Multimedia applications are categorized into three classes:
(i) Streaming stored audio/video,
(ii) Real-time interactive voice/video-over-IP,
(iii) Streaming live audio/video.
Streaming means that a client media player can begin playing the data (such as a movie)
before the entire file has been transmitted.
Streaming Stored Audio and Video
In this class of applications, prerecorded videos, such as movies, television
shows, prerecorded sporting events, or prerecorded user-generated videos, are placed on
servers, and users send requests to the servers to view the videos on demand. The client can
pause, rewind, fast-forward, or drag the slider bar. The initial delay can be about 1 sec. The timing
constraint is that still-to-be-transmitted data must arrive in time for its playout. Many Internet
companies today provide streaming video, including YouTube (Google), Netflix, Amazon,
and Hulu.

Features of streaming stored video:


• Streaming. The client plays out from one location in the video while at
the same time receiving later parts of the video from the server. The client typically begins
video playout within a few seconds after it begins receiving the video from the server. It
avoids having to download the entire video file before playout begins.
• Interactivity. Because the media is prerecorded, the user may pause, reposition forward,
reposition backward, fast-forward, and so on through the video content. The time between the
user request and response to the client should be less than a few seconds.
• Continuous playout. Once playout of the video begins, it should proceed according to the
original timing of the recording. Therefore, data must be received from the server in time for
its playout at the client; otherwise, users experience video frame freezing or frame skipping.
In order to provide continuous playout, the network must provide an average throughput to
the streaming application that is at least as large as the bit rate of the video itself.
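
The throughput condition for continuous playout can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not a model of any real player: the function name `simulate_playout`, the one-second time step, and the fixed start-up buffering time are all hypothetical choices made for illustration.

```python
def simulate_playout(throughput_bps, video_bps, duration_s, startup_s=2):
    """Count the seconds in which the client freezes for lack of data.

    Simplified model (an assumption, not a real player): every second the
    network delivers `throughput_bps` bits (must be positive) until the whole
    video has arrived; after `startup_s` seconds of initial buffering the
    player consumes `video_bps` bits per second.
    """
    total_bits = video_bps * duration_s
    received = 0   # bits delivered by the network so far
    played = 0     # bits already played out
    freezes = 0    # seconds in which playout stalled
    t = 0
    while played < total_bits:
        received = min(total_bits, received + throughput_bps)
        if t >= startup_s:
            if received - played >= video_bps:
                played += video_bps      # enough buffered data: play one second
            else:
                freezes += 1             # buffer underrun: frame freezing
        t += 1
    return freezes
```

With a throughput at or above the video bit rate the sketch reports no freezes; with half the bit rate the buffer repeatedly runs dry.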

Real-Time Interactive Voice-over-IP (VoIP) and Video-over-IP


Real-time conversational voice and video over IP allow users to create conferences with
three or more participants. These applications are widely used, with Internet companies such as
Skype, QQ, and Google Talk boasting hundreds of millions of daily users.
Timing considerations are important because audio and video conversational applications are
highly delay-sensitive. For voice, delays smaller than 150 milliseconds are not perceived by
a human listener, delays between 150 and 400 milliseconds can be acceptable, and delays
exceeding 400 milliseconds can result in frustrating, if not completely unintelligible, voice
conversations.
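
The delay thresholds above can be written as a small helper. This is a minimal sketch; the function name `classify_voice_delay` and the label strings are hypothetical, and only the 150 ms and 400 ms boundaries come from the text.

```python
def classify_voice_delay(delay_ms):
    """Map an end-to-end voice delay (in milliseconds) to its perceived quality."""
    if delay_ms < 150:
        return "not perceived"   # below 150 ms the listener notices nothing
    if delay_ms <= 400:
        return "acceptable"      # 150 to 400 ms is tolerable
    return "frustrating"         # above 400 ms conversation breaks down
```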

St. Joseph’s College of Engineering 1



On the other hand, conversational multimedia applications are loss-tolerant: occasional loss
only causes occasional glitches in audio/video playback, and these losses can often be
partially or fully concealed.

Streaming Live Audio and Video


These applications are similar to traditional broadcast radio and television, except that
transmission takes place over the Internet. It allows a user to receive a live radio or television
transmission (such as a live sporting event or an ongoing news event) transmitted from any
corner of the world. Today, thousands of radio and television stations around the world are
broadcasting content over the Internet. Delays of up to ten seconds or so from when the user
chooses to view a live transmission to when playout begins can be tolerated.

Real-time Transport Protocol (RTP) for multimedia networking applications.


The sender side of a multimedia application appends header fields to the audio/video chunks
before passing the chunks to the transport layer. These header fields include sequence
numbers and timestamps. RTP has a standardized packet structure used for transporting
common formats such as GSM for sound and MPEG1 and MPEG2 for video.
RTP typically runs on top of UDP. Specifically, audio or video chunks of data, generated by
the sending side of a multimedia application, are encapsulated in RTP packets, and each RTP
packet is in turn encapsulated in a UDP segment. RTP can be viewed as a sublayer of the
transport layer, as shown in the figure below.

Figure RTP can be viewed as a sublayer of the transport layer


From the application developer's perspective, however, RTP is not part of the transport layer
but instead part of the application layer. This is because the developer must integrate RTP
into the application. Specifically, for the sender side of the application, the developer must
write code into the application which creates the RTP encapsulating packets; the application
then sends the RTP packets into a UDP socket interface. Similarly, at the receiver side of the
application, the RTP packets enter the application through a UDP socket interface; the
developer therefore must write code into the application that extracts the media chunks from
the RTP packets.
As shown in the figure, the four principal RTP packet header fields are the payload type, sequence
number, timestamp, and source identifier.


Figure RTP header fields.


Version (V): 2 bits long. This field identifies the version of RTP. The version defined by the
RTP specification is two (2).
Padding (P): 1 bit long. If the padding bit is set, the packet contains one or more additional
padding octets at the end which are not part of the payload.
Extension (X): 1 bit long. If the extension bit is set, the fixed header is followed by exactly
one header extension, whose format is defined in the RTP specification.
CSRC count (CC): 4 bits long. The CSRC count contains the number of CSRC identifiers
that follow the fixed header.
Marker (M): 1 bit long. The interpretation of the marker is defined by a profile. It is intended
to allow significant events such as frame boundaries to be marked in the packet stream.
Payload Type: The payload type field in the RTP packet is seven bits long. Thus 2^7 = 128
different payload types can be supported by RTP. For an audio stream, the payload type field
is used to indicate the type of audio encoding (e.g., PCM, adaptive delta modulation, linear
predictive encoding) that is being used. If a sender decides to change the encoding in the
middle of a session, the sender can inform the receiver of the change through this payload
type field. The sender may want to change the encoding in order to increase the audio quality
or to decrease the RTP stream bit rate.
Sequence Number Field: The sequence number field is 16 bits long. The sequence number
increments by one for each RTP packet sent, and may be used by the receiver to detect packet
loss and to restore packet sequence. For example, if the receiver side of the application
receives a stream of RTP packets with a gap between sequence numbers 86 and 89, then the
receiver knows that packets 87 and 88 were lost. The receiver can then attempt to conceal the
lost data.
Timestamp Field: The timestamp field is 32 bits long. It reflects the sampling instant of
the first byte in the RTP data packet. The receiver can use the timestamps in order to remove
packet jitter introduced in the network and to provide synchronous playout at the receiver.
The timestamp is derived from a sampling clock at the sender. As an example, for audio the
timestamp clock increments by one for each sampling period (for example, each 125 µsecs
for an 8 kHz sampling clock); if the audio application generates chunks consisting of 160
encoded samples, then the timestamp increases by 160 for each RTP packet when the source
is active. The timestamp clock continues to increase at a constant rate even if the source is
inactive.


Synchronization Source Identifier (SSRC): The SSRC field is 32 bits long. It identifies the
source of the RTP stream. Typically, each stream in an RTP session has a distinct SSRC. The
SSRC is not the IP address of the sender, but instead a number that the source assigns
randomly when the new stream is started. The probability that two streams get assigned the
same SSRC is very small.
Contributing Source Identifier (CSRC): 0 to 15 items, 32 bits each. The CSRC list
identifies the contributing sources for the payload contained in this packet. The number of
identifiers is given by the CC field. If there are more than 15 contributing sources, only 15
may be identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of
contributing sources.
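
The header layout described above can be sketched with Python's standard `struct` module. This is an illustrative sketch only: the function names `pack_rtp_header` and `parse_rtp_header` are hypothetical, the padding and extension bits are simply left at zero, and no payload handling is shown.

```python
import struct

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=0, csrcs=()):
    """Build the RTP fixed header (12 bytes) plus an optional CSRC list."""
    if len(csrcs) > 15:
        raise ValueError("the 4-bit CC field can count at most 15 CSRC identifiers")
    version, padding, extension = 2, 0, 0
    # First byte: V (2 bits) | P (1 bit) | X (1 bit) | CC (4 bits)
    byte0 = (version << 6) | (padding << 5) | (extension << 4) | len(csrcs)
    # Second byte: M (1 bit) | payload type (7 bits)
    byte1 = ((marker & 0x1) << 7) | (payload_type & 0x7F)
    # "!" selects network byte order: 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    for csrc in csrcs:
        header += struct.pack("!I", csrc & 0xFFFFFFFF)
    return header

def parse_rtp_header(data):
    """Extract the header fields from the start of an RTP packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    cc = byte0 & 0x0F
    csrcs = struct.unpack("!%dI" % cc, data[12:12 + 4 * cc])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 0x1,
        "extension": (byte0 >> 4) & 0x1,
        "csrc_count": cc,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": list(csrcs),
    }
```

A sender would prepend such a header to each media chunk before writing the result to a UDP socket; the receiver would parse it before extracting the chunk.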
RTCP packet types.
(ii) What does an RTCP packet type carry? Explain in detail. (Apr 18)
The RTP Control Protocol (RTCP), also called the Real-Time Control Protocol, is used in combination
with RTP in multimedia networking and multicasting applications. As shown in the figure, RTCP packets are
transmitted by each participant in an RTP session to all other participants in the session. The
RTCP packets are distributed to all the participants using IP multicast. For an RTP session,
typically there is a single multicast address, and all RTP and RTCP packets belonging to the
session use the multicast address. RTP and RTCP packets are distinguished from each other
through the use of distinct port numbers.

Figure RTCP message Transaction


RTCP packets do not encapsulate chunks of audio or video. Instead, they are sent
periodically and carry sender and/or receiver reports that announce statistics that can be
useful to the application.
RTCP Packet Types
The RTCP packet types are:
• Receiver report packets
• Sender report packets
• Source description packets
RTCP packets are stackable, i.e., they can be concatenated into a single packet. The resulting
packet is then encapsulated into a UDP segment and forwarded into the multicast tree.
Receiver report packets


For each RTP stream that a receiver receives as part of a session, the receiver generates a
reception report. The receiver aggregates its reception reports into a single RTCP packet. The
most important fields of a reception report are:
• SSRC of the RTP stream for which the reception report is being generated.
• The fraction of packets lost within the RTP stream. Each receiver calculates the number of
RTP packets lost divided by the number of RTP packets sent as part of the stream. If a
sender receives reception reports indicating that the receivers are receiving only a small
fraction of the sender's transmitted packets, the sender can switch to a lower encoding
rate, thereby decreasing the congestion in the network, which may improve the reception rate.
• The last sequence number received in the stream of RTP packets.
• The interarrival jitter, which is a smoothed estimate of the variation in the interarrival time
between successive packets in the RTP stream.
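
The two statistics in a reception report can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the input format are hypothetical, the number of packets sent is assumed to be known to the receiver (in practice it is inferred from sequence numbers), and the jitter estimator follows the smoothing rule given in RFC 3550 (gain 1/16) rather than anything stated in these notes.

```python
def fraction_lost(expected, received):
    """Fraction of packets lost: (expected - received) / expected."""
    if expected <= 0:
        return 0.0
    return max(0, expected - received) / expected

def interarrival_jitter(packets):
    """Smoothed interarrival jitter estimate.

    `packets` is a list of (rtp_timestamp, arrival_time) pairs, with both
    values expressed in the same timestamp units (an assumption).
    """
    jitter = 0.0
    prev_transit = None
    for rtp_timestamp, arrival_time in packets:
        transit = arrival_time - rtp_timestamp       # relative transit time
        if prev_transit is not None:
            d = abs(transit - prev_transit)          # change in transit time
            jitter += (d - jitter) / 16.0            # RFC 3550 smoothing step
        prev_transit = transit
    return jitter
```

If every packet experiences the same transit time, the estimate stays at zero; any variation in transit time pushes it upward.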
Sender report packets
For each RTP stream that a sender is transmitting, the sender creates and transmits RTCP
sender-report packets. The most important fields of a sender report are:
• SSRC of the RTP stream.
• Number of packets sent in the stream.
• Number of bytes sent in the stream.
• Timestamp and wall-clock time of the most recently generated RTP packet in the
stream.
The information in RTCP reports can be used:
• By senders, to modify their transmission rates in response to receiver feedback.
• By receivers, to synchronize different media streams within an RTP session. For example,
consider a videoconferencing application for which each sender generates two independent
RTP streams, one for video and one for audio. The timestamps in these RTP packets are
tied to the video and audio sampling clocks, and are not tied to real (wall-clock) time.
Each sender report associates the sampling clock with the real-time clock, and receivers
can use this association to synchronize the playout of audio and video.
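
The association between sampling clock and wall-clock time can be sketched as a simple mapping. This is an illustrative sketch: the function name `rtp_to_wallclock` and its parameters are hypothetical, and the clock rates used in the usage note (8 kHz for audio, 90 kHz for video) are example values, not taken from these notes.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock_s, clock_rate_hz):
    """Convert an RTP timestamp to wall-clock seconds.

    `sr_rtp_ts` and `sr_wallclock_s` are the (RTP timestamp, wall-clock time)
    pair carried in the most recent sender report for this stream.
    """
    # Signed difference between two 32-bit timestamps, allowing wraparound.
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_wallclock_s + delta / clock_rate_hz
```

For instance, an audio packet with timestamp 24000 (8 kHz clock, sender report pair 16000 and 100.0 s) and a video packet with timestamp 990000 (90 kHz clock, sender report pair 900000 and 100.0 s) both map to 101.0 s, so the receiver would play them out together.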

Source description packets


For each RTP stream, the sender creates and transmits source-description packets. These
packets provide a mapping between the source identifier (i.e., the SSRC) and the user/host
name.
These packets contain information about the source, such as:
• E-mail address of the sender
• Sender's name
• Application that generates the RTP stream
• SSRC of the associated RTP stream


LIMITATIONS OF BEST EFFORT SERVICE


• Packet loss
• End to end delay
• Packet Jitter
Assume that the sender generates bytes at a rate of 8,000 bytes per second; every 20 msecs
the sender gathers these bytes into a chunk. A chunk and a special header are encapsulated in
a UDP segment and a UDP segment is sent every 20 msecs.

Number of bytes in a chunk = (20 msecs) · (8,000 bytes/sec) = 160 bytes
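
The same arithmetic can be checked in a few lines. The helper names below are hypothetical and the sketch covers only the chunk payload, since the size of the special header is not given.

```python
def chunk_size_bytes(byte_rate, interval_ms):
    """Bytes gathered into one chunk when sending every `interval_ms` milliseconds."""
    return byte_rate * interval_ms // 1000

def chunks_per_second(interval_ms):
    """Number of chunks (and hence UDP segments) sent each second."""
    return 1000 // interval_ms
```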

If there is no problem in the network, each packet reaches the receiver with a constant
end-to-end delay, so packets arrive periodically every 20 msecs. Otherwise, some packets can be
lost and most packets will not have the same end-to-end delay, even in a lightly congested
Internet. Therefore, it is essential for the receiver to determine

(1) Time to start a play back of a chunk, and


(2) Mechanism to handle a missing chunk

• Packet Loss
The UDP segment is encapsulated in an IP datagram. As the datagram wanders through the
network, it passes through router buffers while waiting for transmission on outbound links. It
is possible that one or more of the buffers in the path from sender to receiver is full, in which
case the arriving IP datagram may be discarded, never to arrive at the receiving application.
Packet loss rates between 1 and 20 percent can be tolerated, depending on the type of
encoding and loss concealment mechanism at the receiver. If the loss rate exceeds 10 to 20
percent, however, it is not possible to achieve acceptable quality.
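
At the receiver, lost packets can be identified from gaps in the RTP sequence numbers described earlier (for example, receiving 86 and then 89 means 87 and 88 are missing). The sketch below is a minimal illustration; the function name is hypothetical, and it assumes packets arrive in order, so reordering is not handled.

```python
def missing_sequence_numbers(prev_seq, next_seq):
    """Sequence numbers skipped between two consecutively received packets.

    Sequence numbers are 16 bits wide, so arithmetic wraps modulo 65536.
    Assumes in-order arrival (no reordering).
    """
    gap = (next_seq - prev_seq) & 0xFFFF
    return [(prev_seq + i) & 0xFFFF for i in range(1, gap)]
```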

• End-to-End Delay
End-to-end delay is the accumulation of transmission, processing, and queuing delays in
routers; propagation delays in links; and end-system processing delays. For real-time
conversational applications, such as VoIP, end-to-end delays smaller than 150 msecs are not
perceived by a human listener; delays between 150 and 400 msecs can be acceptable but are
not ideal; and delays exceeding 400 msecs can seriously hinder the interactivity in voice
conversations. The receiving side of a VoIP application will typically disregard any packets
that are delayed more than a certain threshold, for example, more than 400 msecs. Thus,
packets that are delayed by more than the threshold are effectively lost.
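
Because late packets are discarded, the loss seen by the application combines packets dropped in the network with packets that arrive after the threshold. A minimal sketch, assuming a hypothetical input format in which each sent packet is represented by its delay in milliseconds, or `None` if it never arrived:

```python
def effective_loss_rate(delays_ms, threshold_ms=400):
    """Fraction of packets that are unusable: lost in the network or too late.

    `delays_ms` holds one entry per packet sent: the end-to-end delay in
    milliseconds, or None if the packet was dropped in the network.
    """
    if not delays_ms:
        return 0.0
    unusable = sum(1 for d in delays_ms if d is None or d > threshold_ms)
    return unusable / len(delays_ms)
```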

• Packet Jitter


Packet jitter is caused by the varying queuing delays that a packet experiences in the
network's routers. Because of these varying delays, the time from when a packet is generated
at the source until it is received at the receiver can fluctuate from packet to packet. This
phenomenon is called jitter.

Jitter is illustrated in the figure.

Figure Jitter

Jitter can often be removed by using sequence numbers, timestamps, and a playout delay.
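
One simple way to combine these three mechanisms is a fixed playout delay: each chunk is played a fixed time after it was generated, and chunks that arrive after their scheduled time are treated as lost. The sketch below is illustrative only; the function name and tuple format are hypothetical, and it assumes the sender's timestamps and the receiver's arrival times are expressed on a common millisecond clock.

```python
def schedule_playout(packets, playout_delay_ms):
    """Assign a playout time to each packet using a fixed playout delay.

    `packets` is a list of (sequence_number, generated_ms, arrival_ms).
    Returns a dict mapping sequence number to playout time in ms, or to
    None when the packet arrived too late to be played.
    """
    schedule = {}
    for seq, generated_ms, arrival_ms in packets:
        play_at = generated_ms + playout_delay_ms    # same offset for every chunk
        schedule[seq] = play_at if arrival_ms <= play_at else None
    return schedule
```

Packets that arrive in time are played at evenly spaced instants regardless of how much their network delays varied, which is how the jitter is removed. A larger playout delay loses fewer packets but adds to the end-to-end delay.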
