You are on page 1of 42

RTP 1

Real-Time Transport Protocol


(RTP)

August 12, 2001


RTP 2

RTP

• protocol goals
• mixers and translators
• control: awareness, QOS feedback
• media adaptation

August 12, 2001


RTP 3

RTP – the big picture

application

media
encapsulation

RTP RTCP

data control
UDP ST-II

IPv4/6

AAL5
Ethernet
ATM

August 12, 2001


RTP 4

RTP = Real-time transport protocol

• only part of puzzle: reservations, OS, . . .


• product of Internet Engineering Task Force, AVT WG
• RFC 1889, 1890 (to be revised)
• initiated by ITU H.323 (conferencing, Internet telephony), RTSP, SIP, . . .
• support for functions, but does not restrict implementation
• compression for low-bandwidth networks: CRTP (RFC 2508)

August 12, 2001


RTP 5

RTP goals

lightweight: specification and implementation


flexible: provide mechanism, don’t dictate algorithms
protocol-neutral: UDP/IP, ST-II, IPX, ATM-AALx, . . .
scalable: unicast, multicast from 2 to O(107 )
separate control/data: some functions may be taken over by conference control
protocol
secure: support for encryption, possibly authentication

August 12, 2001


RTP 6

Data transport – RTP

Real-Time Transport Protocol (RTP) = data + control


data: timing, loss detection, content labeling, talkspurts, encryption
control: (RTCP) ➠ periodic with T ∼ population
• QOS feedback
• membership estimation
• loop detection

August 12, 2001


RTP 7

RTP functions

• segmentation/reassembly done by UDP (or similar)


• resequencing (if needed)
• loss detection for quality estimation, recovery
• intra-media synchronization: remove delay jitter through playout buffer
• intra-media synchronization: drifting sampling clocks
• inter-media synchronization (lip sync between audio and video)
• quality-of-service feedback and rate adaptation
• source identification

August 12, 2001


RTP 8

RTP mixers, translators, . . .

mixer:
• several media stream ➠ one new stream (new encoding)
• mixer: reduced bandwidth networks (dial-up)
• appears as new source, with own identifier
translator:
• single media stream
• may convert encoding
• protocol translation (native ATM ↔ IP), firewall
• all packets: source address = translator address

August 12, 2001


RTP 9

RTP mixers, translators, . . .

end system
SSRC=17
DVI4
192.35.149.52
GSM mixer SSRC=5
SSRC=5 CSRC= 17 39
translator GSM
end system L16
SSRC=39 192.26.8.84 192.20.225.101

128.119.40.186

August 12, 2001


RTP 10

RTP packet header

bit 0 8 16 24 32

V(2)P X CSRC
count M payload type sequence number
timestamp
UDP packet

synchronization source identifier (SSRC)

opt. opt.
contributing source identifiers (CSRC)

header extension
payload (audio,video,...)

opt.
0x00
bytes

August 12, 2001


RTP 11

RTP packet header

Payload type: audio/video encoding method; may change during session


SSRC: sychronization source ➠ sources pick at random
➠ may change after collision!
sequence number: +1 each packet ➠ gaps ≡ loss

P: padding (for encryption) ➠ last byte has padding count


M: marker bit; frame, start of talkspurt ➠ delay adjustment
CC: content source count (for mixers)
CSRC: identifiers of those contributing to (mixed into) packet

August 12, 2001


RTP 12

RTP timestamp

• +1 per sample (e.g., 160 for 20 ms packets @ 8000 Hz)


• random starting value
• different fixed rate for each audio PT
• 90 kHz for video
• several video frames may have same timestamp
• ➠ gaps ≡ silence
• time per packet may vary
• split video frame (carefully. . . ) across packets
• typical: 20 to 100 ms of audio

August 12, 2001


RTP 13

RTP in a network

• typical: UDP, no fixed port; RTCP port = RTP port (even) + 1


• typical UDP size limited to few hundred bytes (OS, network, fragmentation)
• native ATM: directly into AAL5 frame
• encapsulation (length field) for others
• typically: one media (audio, video, . . . ) per port pair
• exception: bundled MPEG

August 12, 2001


RTP 14

RTP control protocol – types

stackable packets, similar to data packets


sender report (SR): bytes send ➠ estimate rate;
timestamp ➠ synchronization
reception reports (RR): number of packets sent and expected ➠ loss, interarrival
jitter, round-trip delay
source description (SDES): name, email, location, . . .
CNAME (canonical name = user@host) identifies user across media
explicit leave (BYE): in addition to time-out
extensions (APP): application-specific (none yet)

August 12, 2001


RTP 15

RTCP packet structure

if encrypted: random 32-bit integer

packet packet packet

chunk chunk
receiver reports item item item item
SSRC

SSRC

SSRC

SSRC

SSRC

SSRC
SSRC
sender site 1 site 2 CNAME PHONE CNAME LOC reason
SR report SDES BYE

compound packet
UDP packet

August 12, 2001


RTP 16

RTCP announcement interval computation

Goals:
• estimate current # & identities of participants – dynamic
• source description (“SDES”) ➠ who’s talking?
• quality-of-service feedback ➠ adjust sender rate
• to O(1000) participants, few % of data
➠ randomized response with rate ↓ as members ↑
• group size limited by tolerable age of status
• gives active senders more bandwidth
• soft state: delete if silent

August 12, 2001


RTP 17

RTCP bandwidth scaling

• every participant: periodically multicast RTCP packet to same group as data


• ➠ everybody knows (eventually) who’s out there
• session bandwidth:
– single audio stream
P
– of concurrently active video streams

August 12, 2001


RTP 18

RTCP bandwidth scaling

• sender period T :
# of senders
T = · avg. RTCP packet size
0.25 · 0.05 · session bw
• receivers:
# of receivers
T = · avg. RTCP packet size
0.75 · 0.05 · session bw
• next packet = last packet + max(5 s, T ) · random(0.5. . . 1.5)
• randomization prevents “bunching”
• to reduce RTCP bandwidth, alternate between SDES components

August 12, 2001


RTP 19

RTCP sender reports (SR)

SSRC of sender: identifies source of data


NTP timestamp: when report was sent
RTP timestamp: corresponding “RTP time” ➠ lip sync
sender’s packet count: total number sent
sender’s octet count: total number sent
followed by zero or more receiver report

August 12, 2001


RTP 20

RTCP receiver reports (RR)

SSRC of source: identifies who’s being reported on


fraction lost: binary fraction
cumulative number of packets lost: long-term loss
highest sequence number received: compare losses, disconnect
interarrival jitter: smoothed interpacket distortion
LSR: time last SR heard
DLSR: delay since last SR

August 12, 2001


RTP 21

Intermedia synchronization

= sync different streams (audio, video, slides, . . . )


• timestamps are offset with random intervals
• may not tick at nominal rate
• SRs correlate “real” time (wallclock time) with RTP ts
560 = 8:45:17.23
audio
0 160 320 480 640 800 960 1120
RTP RTCP SR
RTP timestamp
1800 = 8:45:17.18
video
0 9000
RTCP SR

August 12, 2001


RTP 22

Round-trip delay estimation

compute round-trip delay between data sender and receiver


[10 Nov 2001 11:33:25.125] [10 Nov 2001 11:33:36.5]
n SR(n) A=0xb710:8000 (46864.500 s)

ntp_sec =0xb44db705 dlsr=0x0005.4000 ( 5.250 s)


ntp_frac=0x20000000 lsr =0xb705:2000 (46853.125 s)
(3024992016.125 s)

r RR(n)

DLSR
(5.25 s)
A 0xb710:8000 (46864.500 s)
DLSR −0x0005:4000 ( 5.250 s)
LSR −0xb705:2000 (46853.125 s)

delay 0x 6:2000 ( 6.125 s)

August 12, 2001


RTP 23

RTP: Large groups

How do manage large groups?


• “movie at ten”
• channel surfing
➠ reconsideration: pause and recompute interval
• conditional reconsideration: only if group size estimate increases
• unconditional reconsideration: always
• reverse reconsideration to avoid time-outs

August 12, 2001


RTP 24

BYE floods

• avoid BYE floods: don’t send BYE if no RTCP


• reconsideration
More general:
• general bandwidth sharing problem
• “squeaky wheel” network management

August 12, 2001


RTP 25

Reconsideration: learning curve

Learning Curve
100000

10000

1000
Number

100 L(t) Current


L(t) Conditional Reconsideration
L(t) Unconditional Reconsideration
Ideal

10

1
1 10 100 1000 10000 100000
Time (s)

August 12, 2001


RTP 26

Reconsideration: influence of delay

Cumulative Packets Sent


3000
Uniform
Fixed
Exponential
2500

2000
Number

1500

1000

500

0
0 100 200 300 400 500 600 700 800 900 1000
Time (s)

August 12, 2001


RTP 27

RTP: Aggregation

• interconnected IPTel gateways ➠ several RTP streams to same destination


• high overhead: G.729, 30 ms packetization ➠ 30 bytes audio, 40 bytes IP + UDP
+ RTP headers
• with ATM: efficiency = 28%
• solution: bundle several calls into single RTP session

August 12, 2001


RTP 28

RTP: Aggregation

M PT sequence number

timestamp

SSRC

M PT 0 call ID M PT 0 call ID

M PT 1 call ID length of block

first audio block

second audio block

third audio block

• for 24 channels ➠ efficiency ↑ 89%


• signal call-ID using SIP

August 12, 2001


RTP 29

Collision detection and resolution

Collision:
• two sources may pick the same SSRC (“birthday problem”)
• probability: about 10−4 if 1000 session members join more or less simultaneously
• but: don’t pick one you know about already ➠ probability much lower unless
everyone joins at the same time
• send BYE for old, pick a new identifier

August 12, 2001


RTP 30

Loop detection

• forward packet to same multicast group (directly or through translators)


• looks similar to collision, but changing SSRC doesn’t help
• look at RTCP packets

August 12, 2001


RTP 31

RTP for the masses

• for 14.4 kb/s stream: 90 B/s ≈ 1 new site/s


• takes ≈ 3 hours to get to know 10,000 people ➠
– who cares? (Nielsen!)
– useless for QOS feedback
– control rate too high
• ➠ statistical sample (sender determines rate): send value [0, 1]; pick random
value; if <, lucky winner ➠ needs to be adaptive
• ➠ report just to sender, instead of multicast

August 12, 2001


RTP 32

Adaptive applications

August 12, 2001


RTP 33

Adaptive applications

Multimedia applications can adjust their data rates:


Audio: encoding parameters (MPEG L3), encoding, sampling rate, mono/stereo

encoding sampling rate bit rate


LPC 8,000 5,600
GSM 8,000 13,200
DVI4 8,000 32,000
µ-law 8,000 64,000
DVI4 16,000 64,000
a range of DVI4 and MPEG L3
L16 stereo 44,100 1,411,200

August 12, 2001


RTP 34

Adaptive applications

Video: frame rate, quantization, image resolution, encoding


30000
Noise
Face 1
25000 Face 2
White Picture
Black Picture
20000
Size [Bytes]

15000

10000

5000

0
0 50 100 150 200 250 300
Q-Factor

August 12, 2001


RTP 35

Application control

• networks with QoS guarantees:


– QoS at call set-up, guaranteed
– long call durations ➠ network load may change
– “wrong” guess ➠ rejected calls or low quality
• networks w/o QoS or shared reserved link:
– adapt application to available bandwidth
– share bandwidth fairly with TCP?
– lowest common demoninator ➠ mixers, translators

August 12, 2001


RTP 36

TCP-friendly applications

• avoid race due to FEC, aggressive retransmission


• push aside TCP applications (sometimes ok. . . )
• avoid congestion collapse
• avoid being but in “penalty box”
• time scale?

August 12, 2001


RTP 37

TCP-friendly adaptation

• rate computation (e.g.,):


– use additive-increase, multiplicative-decrease
– use loss/RTT equation: throughput = R1.22
√ , where R is the round-trip time and
p
p ≈ loss fraction
• mechanisms:
– TCP ACKs, without retransmission −→ overhead, no multicast
– RTCP RR −→ delay, metric?

August 12, 2001


RTP 38

RTP: Status and Issues

Compression: differential compression for low-speed point-to-point links ➠


compress IP, UDP, RTP into 1–2 bytes
Aggregation: trunking of packet streams or Internet telephony gateways
Large groups: RTCP feedback for O(10,000); sampling
RTP (RFC 1889, RFC 1890) −→ draft standard

August 12, 2001


RTP 39

RTP Header Compression

• large overhead for IP + UDP + RTP headers: 40 bytes


• CRTP = lossless differential compression that reduces overhead to two bytes on
(low-speed) point-to-point links
• derived from VJ TCP/IP header compression
• context: IP address, port, RTP SSRC
• IP: only packet ID changes
• UDP: only checksum
• RTP: second-order difference of timestamp and sequence number is zero
• resynchronization by NAK −→ not good for high BER, delay

August 12, 2001


RTP 40

RTP Header Compression

• link layer indicates FULL_HEADER, COMPRESSED_UDP, COMPRESSED_RTP,


CONTEXT_STATE (no IP header)
• differences are encoded as variable-length fields:
-16384 C0 00 00
-129 C0 3F 7F
-128 80 00
-1 80 7F
0 00
127 7F
128 80 80
16383 BF FF
16384 C0 40 00
4194303 FF FF FF

August 12, 2001


RTP 41

CRTP Packet Header

0 1 2 3 4 5 6 7
msb of session context ID
(if 16−bit CID)

lsb of session context ID

M
(marker)
S T I link sequence

UDP checksum
(if non−zero in context)

random fields
(if non−zero in context)
[used when IP−in−IP header]

CC
if MSTI = 1111 M’ S’ T’ I’ (CSRC count)

delta IPv4 ID
(if I or I’ = 1)
delta RTP sequence
(if S or S’ = 1)

delta RTP timestamp


(if T or T’ = 1)

CSRC list
(if MSTI = 1111 and CC != 0)

RTP header extension


(if X bit set in context)

RTP data

August 12, 2001


RTP 42

Some RTP Implementations

tool who media RSVP adaptive


NeVoT GMD Fokus audio yes not yet
vic LBNL video no no
vat LBNL audio no no
rat UCL audio no no
Rendezvous INRIA A/V no yes
NetMeeting Microsoft A/V no no
IP/TV Cisco A/V no no
RM G2 Real A/V no yes
http://www.cs.columbia.edu/˜hgs/rtp/

August 12, 2001