1. End Devices: SIP phone, PC/laptop with SIP Client, PDA, mobile phone
2. PSTN Gateways are a type of User Agent
o SIP Proxy Servers: forward or "proxy" requests on behalf of User Agents, route requests, and
consult databases (DNS, Location Server). There can be any number of them.
o Location Server: database of locations of SIP User Agents. Queried by proxies in routing.
Updated by User Agents through registration.
o DNS Server: SRV (Service) records are used to locate inbound Proxy Servers.
The roles of UAC and UAS, as well as Proxy and Redirect Server, are defined on a
transaction-by-transaction basis.
For example, the User Agent initiating a call acts as a UAC when sending the initial INVITE
request and as a UAS when receiving a BYE request from the callee.
Similarly, the same software can act as a Proxy Server for one request and as a Redirect Server
for the next request.
Proxy, Location, and Registrar Servers are logical entities; implementations may combine them
into a single application.
SIP PROXY
o Intermediary entity that acts as both a server and a client for the purpose of making requests on
behalf of other clients
Request-URI
Via
Record-Route
Route
Max-Forwards
Proxy-Authorization
Perform routing function, i.e., determine to which hop (UA/proxy/redirect) signaling should be relayed
Header fields that can be legitimately modified by proxy servers are: Request-URI, Via, Record-Route,
Route, Max-Forwards, and Proxy-Authorization. If these header fields are not intact end-to-end,
implementations should not consider this a breach of security.
The Request-URI indicates the user or service to which this request is being addressed (i.e. the
destination address)
The Via indicates the path taken by the request and identifies the location where the response is to be
sent (i.e. the source address)
Record-Route is inserted by a proxy to force future requests in the dialog to be routed through that proxy
Route is used to force routing of a request through the listed set of proxies
Max-Forwards limits the number of hops a request can transit on the way to its destination (i.e. the
maximum number of hops allowed)
Proxy-Authorization allows the client to identify itself (or its user) to a proxy that requires authentication
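As a rough illustration of how a proxy touches these header fields when it forwards a request, the following sketch decrements Max-Forwards, pushes its own Via entry, and optionally adds a Record-Route entry. The function name, the dictionary-of-lists header model and the host names are assumptions made for this example only; the z9hG4bK branch prefix and the default of 70 hops follow RFC 3261.

```python
import uuid


def forward_through_proxy(headers, proxy_host, record_route=False):
    """Return a copy of `headers` as a proxy might forward them (illustrative only)."""
    # Copy so that the received request is left untouched.
    out = {name: list(values) for name, values in headers.items()}

    # Max-Forwards: refuse to forward when no hops are left, otherwise decrement.
    hops_left = int(out.get("Max-Forwards", ["70"])[0])
    if hops_left <= 0:
        raise ValueError("483 Too Many Hops")
    out["Max-Forwards"] = [str(hops_left - 1)]

    # Via: the proxy adds itself on top so the response retraces the path.
    branch = "z9hG4bK" + uuid.uuid4().hex
    out.setdefault("Via", []).insert(0, f"SIP/2.0/UDP {proxy_host};branch={branch}")

    # Record-Route: only added if the proxy wants to stay on the path of the dialog.
    if record_route:
        out.setdefault("Record-Route", []).insert(0, f"<sip:{proxy_host};lr>")
    return out
```

Header fields outside this set are copied unchanged, which reflects the idea that a proxy leaves the rest of the message intact.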
The proxy generates a CANCEL request for all pending client transactions associated with this
response context.
A proxy also generates a CANCEL request for all pending client transactions associated with this
response context when it receives a 6xx response. A pending client transaction is one that has
received a provisional response but no final response (it is in the Proceeding state) and has not
had an associated CANCEL generated for it.
A stateful proxy responds to a CANCEL, rather than simply forwarding a response it would
receive from a downstream element. For that reason, CANCEL is referred to as a "hop-by-hop"
request, since it is responded to at each stateful proxy hop.
When a response is received by an element, it first tries to locate a client transaction matching
the response and then performs the following processing:
6. When necessary, choose the best final response from the response context
The Back-To-Back User Agent (B2BUA) is a SIP based logical entity that can receive and process INVITE
messages as a SIP User Agent Server (UAS). It also acts as a SIP User Agent Client (UAC) that determines
how the request should be answered and how to initiate outbound calls. Unlike a SIP proxy server, the
B2BUA maintains complete call state and participates in all call requests.
NOTE:
B2BUA is Call-Stateful
Framing
In the case of message-oriented transports (such as UDP), if the message has a Content-Length header
field, the message body is assumed to contain that many bytes.
If there are additional bytes in the transport packet beyond the end of the body, they are discarded. If
the transport packet ends before the end of the message body, this is considered an error. If the
message is a response, it must be discarded. If the message is a request, the element should generate a
400 (Bad Request) response. If the message has no Content-Length header field, the message body is
assumed to end at the end of the transport packet.
In the case of stream-oriented transports such as TCP, the Content-Length header field indicates the
size of the body. The Content-Length header field must be used with stream-oriented transports.
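The datagram framing rules above can be sketched as follows. This is a minimal illustration, not a full parser: the function name is hypothetical, header line folding is ignored, and an error is reported as a plain ValueError instead of generating a 400 response.

```python
def frame_datagram(datagram: bytes):
    """Split one UDP datagram into (start_line, body) using Content-Length."""
    head, separator, rest = datagram.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line between headers and body")

    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]

    # Look for Content-Length (or its compact form "l").
    content_length = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() in ("content-length", "l"):
            content_length = int(value.strip())

    if content_length is None:
        # No Content-Length: the body runs to the end of the packet.
        return start_line, rest
    if len(rest) < content_length:
        # Packet ended before the body did: this is an error.
        raise ValueError("transport packet ends before end of body")
    # Bytes beyond the declared body length are discarded.
    return start_line, rest[:content_length]
```

A caller would drop the message on ValueError if the start line is a status line, and answer with 400 (Bad Request) if it is a request line.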
Error Handling
If the transport user asks for a message to be sent over an unreliable transport, and the result is an ICMP
error, the behavior depends on the type of ICMP error. Host, network, port or protocol unreachable
errors, or parameter problem errors, cause the transport layer to inform the transport user of a failure in
sending. Source quench and TTL exceeded ICMP errors are ignored.
If the transport user asks for a request to be sent over a reliable transport, and the result is a connection
failure, the transport layer informs the transport user of a failure in sending.
Client Transaction
The client transaction provides its functionality through the maintenance of a state machine.
There are two types of client transaction state machines, depending on the method of the request
passed by the TU. One handles client transactions for INVITE requests. This type of machine is referred
to as an INVITE client transaction. Another type handles client transactions for all requests except INVITE
and ACK. This is referred to as a non-INVITE client transaction. There is no client transaction for ACK. If
the TU wishes to send an ACK, it passes one directly to the transport layer for transmission.
Server Transaction
The server transaction is responsible for the delivery of requests to the TU and the reliable transmission
of responses. It accomplishes this through a state machine. As with the client transactions, the state
machine depends on whether the received request is an INVITE request.
1. Method: RFC 3261 defines six methods:
a. REGISTER
b. INVITE
c. ACK
d. CANCEL
e. BYE
f. OPTIONS
3. SIP-Version: Include the version of SIP in use, and follow [H3.1] (with HTTP replaced by SIP, and
HTTP/1.1 replaced by SIP/2.0)
2. Status-Code: 3-digit integer that indicates the outcome of an attempt to understand and satisfy
a request
3. Reason-Phrase: Short textual description of the Status-Code, intended for the human user
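To make the start-line fields concrete, here is a small sketch that splits a request line (Method, Request-URI, SIP-Version) or a status line (SIP-Version, Status-Code, Reason-Phrase) into its parts. The function name and the returned dictionary layout are assumptions for this example.

```python
def parse_start_line(line: str):
    """Parse a SIP request line or status line into its three fields."""
    if line.startswith("SIP/"):
        # Status-Line: SIP-Version SP Status-Code SP Reason-Phrase
        version, code, reason = line.split(" ", 2)
        return {
            "kind": "response",
            "version": version,
            "status_code": int(code),
            "reason_phrase": reason,
        }
    # Request-Line: Method SP Request-URI SP SIP-Version
    method, request_uri, version = line.split(" ")
    return {
        "kind": "request",
        "method": method,
        "request_uri": request_uri,
        "version": version,
    }
```

The reason phrase is split off last because it may itself contain spaces, as in "Not Found".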
A session invitation consists of one INVITE request, which is usually sent to a proxy. The proxy
immediately sends a 100 Trying reply to stop retransmissions and forwards the request onward.
All provisional responses generated by the callee are sent back to the caller. See the 180 Ringing
response in the call flow; it is generated when the callee's phone starts ringing.
A 200 OK is generated once the callee picks up the phone and it is retransmitted by the callee's user
agent until it receives an ACK from the caller. The session is established at this point.
Session termination is accomplished by sending a BYE request within the dialog established by the
INVITE. BYE messages are sent directly from one user agent to the other unless a proxy on the path of
the INVITE request indicated that it wishes to stay on the path by using record routing.
The party wishing to tear down a session sends a BYE request to the other party involved in the session.
The other party sends a 200 OK response to confirm the BYE and the session is terminated.
ACK is kept separate because delivery of every 200 OK matters. A 200 OK not only establishes a
session; when a proxy server forks the request, several entities can each generate one, and all of them
must be delivered to the calling user agent. User agents therefore take responsibility in this case and
retransmit 200 OK responses until they receive an ACK.
We have seen in the previous slides what transactions are: one transaction includes the INVITE and its
responses, and another transaction includes the BYE and its responses when a session is being torn
down. But those two transactions should be somehow related; both of them belong to the same dialog.
We have seen that the CSeq header field is used to order messages; in fact it is used to order messages
within a dialog. The number must increase monotonically for each new request sent within a dialog,
otherwise the peer will handle it as an out-of-order request or a retransmission. In fact, the CSeq number
identifies a transaction within a dialog, because a request and its associated responses are called a
transaction. This means that only one transaction in each direction can be active within a dialog. One
could also say that a dialog is a sequence of transactions.
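The CSeq ordering rule can be sketched as a small per-dialog counter. The class and method names are hypothetical, and the sketch deliberately ignores the request method (ACK and CANCEL reuse the CSeq number of the request they refer to), so it only illustrates the numeric comparison.

```python
class DialogSequence:
    """Tracks CSeq numbers for one dialog, separately for each direction."""

    def __init__(self, first_local_cseq: int = 1):
        self._next_local = first_local_cseq
        self._last_remote = None  # no request seen from the peer yet

    def next_request_cseq(self) -> int:
        """CSeq number to put in the next request we send in this dialog."""
        cseq = self._next_local
        self._next_local += 1
        return cseq

    def classify_incoming(self, cseq: int) -> str:
        """Classify a request received from the peer by its CSeq number."""
        if self._last_remote is None or cseq > self._last_remote:
            self._last_remote = cseq
            return "new"
        if cseq == self._last_remote:
            return "retransmission"
        return "out-of-order"
```

Each side keeps its own counter for the requests it sends, which is why the local and remote numbers are tracked independently.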
Dialogs are also used to route messages between user agents. For example, suppose that user
sip:Bob@wipro.com wants to talk to user sip:Alice@sip.com. He knows the SIP address of the callee
(sip:Alice@sip.com), but this address says nothing about the current location of the user, i.e. the caller
doesn't know which host to send the request to. Therefore the INVITE request will be sent to a proxy
server.
The request will be sent from proxy to proxy until it reaches one that knows the current location of the
callee. This process is called routing. Once the request reaches the callee, the callee's user agent will
create a response that will be sent back to the caller. The callee's user agent will also put a Contact
header field into the response, containing the current location of the user. The original request also
contained a Contact header field, which means that both user agents know the current location of the
peer.
Because the user agents know location of each other, it is not necessary to send further requests to any
proxy--they can be sent directly from user agent to user agent. That's exactly how dialogs facilitate
routing.
Further messages within a dialog are sent directly from user agent to user agent. This is a significant
performance improvement because proxies do not see all the messages within a dialog, they are used to
route just the first request that establishes the dialog. The direct messages are also delivered with much
smaller latency because a typical proxy usually implements complex routing logic.
A dialog created by INVITE is terminated with a BYE (a CANCEL ends an INVITE that has not yet received
a final response). Similarly, a dialog created by SUBSCRIBE is terminated when the subscription
terminates, signalled by NOTIFY or by SUBSCRIBE itself.
Two users can exchange SDP documents via email, or even snail mail, to set up a session.
A session is established if there is a proper exchange of SDP between two parties and this exchange
results in media being exchanged between the parties.
SIP Events describes a method for setting up a SIP dialog. In this case, the dialog is used for the context
of sending NOTIFY messages to the subscribing endpoint. As such, there is no session associated with a
dialog established by a SUBSCRIBE request.
A property of this selection requirement is that a UA will place a different tag into the From header of an
INVITE than it would place into the To header of the response to the same INVITE. This is needed in
order for a UA to invite itself to a session, a common case for "hairpinning" of calls in PSTN gateways.
Similarly, two INVITEs for different calls will have different From tags, and two responses for different
calls will have different To tags.
When a server transaction is constructed for a request, it enters the "Proceeding" state. The server
transaction must generate a 100 (Trying) response unless it knows that the TU will generate a provisional
or final response within 200 ms, in which case it may generate a 100 (Trying) response.
If, while in the "Proceeding" state, the TU passes a 2xx response to the server transaction, the server
transaction passes this response to the transport layer for transmission. It is not retransmitted by the
server transaction; retransmissions of 2xx responses are handled by the TU. The server transaction then
transitions to the "Terminated" state.
While in the "Proceeding" state, if the TU passes a response with status code from 300 to 699 to the
server transaction, the response is passed to the transport layer for transmission, and the state machine
enters the "Completed" state.
If an ACK is received while the server transaction is in the "Completed" state, the server transaction
transitions to the "Confirmed" state.
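The INVITE server transaction described above can be sketched as a tiny state machine. State names here follow RFC 3261 (Proceeding, Completed, Confirmed, Terminated); the class and method names are hypothetical, timers and retransmissions are omitted, and the sketch always sends 100 (Trying) on creation.

```python
class InviteServerTransaction:
    """Simplified INVITE server transaction (no timers, no retransmissions)."""

    def __init__(self):
        self.state = "Proceeding"
        self.sent = [100]  # 100 (Trying) generated when the transaction is created

    def tu_response(self, status_code: int):
        """The TU hands a response to the transaction for transmission."""
        if self.state != "Proceeding":
            raise RuntimeError(f"cannot send a response in state {self.state}")
        self.sent.append(status_code)
        if 200 <= status_code <= 299:
            # 2xx: passed on once; retransmissions are the TU's job.
            self.state = "Terminated"
        elif 300 <= status_code <= 699:
            # Final non-2xx response: wait for the ACK.
            self.state = "Completed"
        # 1xx responses leave the transaction in Proceeding.

    def ack_received(self):
        """An ACK only matters while waiting for it in Completed."""
        if self.state == "Completed":
            self.state = "Confirmed"
```

For example, a 180 followed by a 486 leaves the transaction in Completed until the ACK arrives, whereas a 200 terminates it immediately.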
The initial state, "Calling", is entered when the TU initiates a new client transaction with an INVITE
request. If an unreliable transport is being used, the client transaction starts timer A with a value of T1. If
a reliable transport is being used, the client transaction should not start timer A (Timer A controls
request retransmissions). For any transport, the client transaction starts timer B with a value of 64*T1
seconds (Timer B controls transaction timeouts).
If the client transaction is still in the "Calling" state when timer B fires, the client transaction informs the
TU that a timeout has occurred. The client transaction must not generate an ACK. The value of 64*T1 is
equal to the amount of time required to send seven (7) requests in the case of an unreliable transport.
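The "seven requests" figure can be checked with a short calculation. The sketch below assumes that timer A doubles after every retransmission and that T1 defaults to 500 ms, as in RFC 3261; the function name is hypothetical.

```python
def invite_retransmit_schedule(t1: float = 0.5):
    """Return (send_times, timer_b) for an INVITE over an unreliable transport."""
    timer_b = 64 * t1
    send_times = [0.0]   # the initial request
    interval = t1        # timer A starts at T1
    while send_times[-1] + interval < timer_b:
        send_times.append(send_times[-1] + interval)
        interval *= 2    # timer A doubles after each retransmission
    return send_times, timer_b
```

With T1 = 0.5 s the requests go out at 0, 0.5, 1.5, 3.5, 7.5, 15.5 and 31.5 seconds, i.e. seven requests in total, and timer B fires at 32 seconds before an eighth could be sent.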
If the client transaction receives a provisional response while in the "Calling" state, it transitions to the
"Proceeding" state. Any further provisional responses are passed up to the TU while in the "Proceeding"
state.
When in either the "Calling" or "Proceeding" states, reception of a response with status code from 300
to 699 causes the client transaction to transition to "Completed".
When in either the "Calling" or "Proceeding" states, reception of a 2xx response causes the client
transaction to enter the "Terminated" state.
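The INVITE client transaction transitions can be sketched in the same way. State names follow RFC 3261 (Calling, Proceeding, Completed, Terminated); the class, its attributes and the "timeout" marker passed to the TU are assumptions for this example, and real timers are replaced by explicit method calls.

```python
class InviteClientTransaction:
    """Simplified INVITE client transaction driven by explicit events."""

    def __init__(self, reliable_transport: bool, t1: float = 0.5):
        self.state = "Calling"
        # Timer A (retransmissions) is only started on unreliable transports.
        self.timer_a = None if reliable_transport else t1
        # Timer B (transaction timeout) is started for any transport.
        self.timer_b = 64 * t1
        self.passed_to_tu = []

    def on_response(self, status_code: int):
        if self.state not in ("Calling", "Proceeding"):
            return  # responses in other states are not handled in this sketch
        self.passed_to_tu.append(status_code)
        if 100 <= status_code <= 199:
            self.state = "Proceeding"
        elif 200 <= status_code <= 299:
            self.state = "Terminated"
        else:
            self.state = "Completed"  # 300-699

    def on_timer_b(self):
        # Timeout while still in Calling: inform the TU, never send an ACK.
        if self.state == "Calling":
            self.passed_to_tu.append("timeout")
            self.state = "Terminated"
```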
A user's location-specific address
Location-independent addresses
To detect loops, proxies insert branch parameters that consist of two parts: the first part is
the normal, globally unique branch parameter, and the second part is used to detect loops
and spirals.
Loop detection is performed by verifying that, when a request returns to a proxy, the fields having an
impact on the processing of the request have not changed. These include the incoming Request-URI and
any header fields affecting the request's admission or routing (including any Route, Proxy-Require and
Proxy-Authorization header fields). The value placed in this part of the branch parameter reflects all of
those fields. This is to ensure that if the request is routed back to the proxy and one of those fields has
changed, it is treated as a spiral and not a loop. A common way to create this value is to compute a
cryptographic hash of the To tag, From tag, Call-ID header field, the Request-URI of the request received
(before translation), the topmost Via header, and the sequence number from the CSeq header field, in
addition to any Proxy-Require and Proxy-Authorization header fields that may be present.
The request method is not included in the calculation of the branch parameter because, in the case of
CANCEL and of ACK for non-2xx responses, the branch parameter is the same as that of the request.
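A two-part branch value of this kind might be built as in the sketch below. The hash algorithm (SHA-256), the "." separator between the two parts, the truncation lengths and all function names are assumptions for illustration; only the z9hG4bK prefix and the list of hashed fields come from RFC 3261. Note that the request method is not part of the hash input.

```python
import hashlib
import uuid

MAGIC_COOKIE = "z9hG4bK"  # prefix of every RFC 3261 branch parameter


def loop_detection_hash(to_tag, from_tag, call_id, request_uri, top_via,
                        cseq_number, proxy_require="", proxy_authorization=""):
    """Hash the fields that influence how the proxy processes the request."""
    fields = [to_tag, from_tag, call_id, request_uri, top_via,
              str(cseq_number), proxy_require, proxy_authorization]
    joined = "\x00".join(fields)  # separator keeps field boundaries unambiguous
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


def make_branch(**fields):
    """First part: globally unique. Second part: loop-detection hash."""
    unique = uuid.uuid4().hex[:12]
    return f"{MAGIC_COOKIE}{unique}.{loop_detection_hash(**fields)}"


def is_loop(own_branch_in_via, **current_fields):
    """Same hash as before means a loop; a changed field means a spiral."""
    _, _, old_hash = own_branch_in_via.rpartition(".")
    return old_hash == loop_detection_hash(**current_fields)
```

When a request arrives carrying a Via entry that this proxy inserted earlier, `is_loop` is called with the branch from that entry and the fields of the request as it looks now.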
1. Session description
v= (protocol version)
s= (session name)
2. Time description
3. Media description
HTTP basic authentication requires the transmission of a username and a matching password embedded
in the header of an HTTP request. Included in a SIP request, this user information could be used by a SIP
proxy server or destination user agent to authenticate a SIP client or the previous SIP hop in a proxy
chain. Because the clear text password can be easily sniffed and therefore poses a serious security risk,
the use of HTTP basic authentication has been deprecated by SIP 2.0.
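To see why the password counts as clear text, note that basic authentication only base64-encodes the "username:password" pair, and base64 is reversible without any key. The helper names below are hypothetical; the encoding itself is standard HTTP basic authentication.

```python
import base64


def basic_credentials(username: str, password: str) -> str:
    """Build the value of a basic-authentication header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def sniff_basic(header_value: str):
    """What an eavesdropper can recover from a captured header value."""
    _scheme, _, token = header_value.partition(" ")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Anyone who captures the header can call the equivalent of `sniff_basic` and obtain the original username and password.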
Pretty Good Privacy could be potentially used to authenticate and optionally encrypt MIME payloads
contained in SIP messages but SIP 2.0 has deprecated the use of PGP in favor of S/MIME.
Cryptography is an important element of any strategy to address data transmission security
requirements. It is the practical art of converting messages or data into a different form, such that
no one can read them without having access to the 'key'. The message may be converted using a 'code' (in
which case each character or group of characters is substituted by an alternative one), or a 'cypher' or
'cipher' (in which case the message as a whole is converted, rather than individual characters).
• Symmetric cryptography
• Involves a single, secret key, which both the message-sender and the message-
recipient must have
• Used by the sender to encrypt the message, and by the recipient to decrypt it
• If they are identical, then the message that was received must have
been identical with that which was sent.
• Major difficulty with symmetric schemes is that the secret key has to be
possessed by both parties, and hence has to be transmitted from whomever
creates it to the other party. But if the key is compromised, all of the data
transmission security measures are undermined. The steps taken to provide a
secure mechanism for creating and passing on the secret key are referred to as
'key management'
• Asymmetric (public key) cryptography
• Involves two related keys, referred to as a 'key-pair', one of which only the
owner knows (the 'private key') and the other which anyone can know (the
'public key')
• Knowledge of the public key by a third party does not compromise the security
of data transmissions
• Use the AES (Advanced Encryption Standard) 128 bit keys in cipher block chaining (CBC)
mode
Reference : Rescorla, E.K., "SSL and TLS - Designing and Building Secure Systems", 2001.
Full Cone
A computer behind a NAT with IP 10.0.0.1, sending and receiving on port 8000, is mapped to the external
IP:port on the NAT of 202.123.211.25:12345. Anyone on the Internet can send packets to that IP:port
and those packets will be passed on to the client machine listening on 10.0.0.1:8000.
Restricted Cone
In the case where the client sends out a packet to external computer 1, the NAT maps the client’s
10.0.0.1:8000 to 202.123.211.25:12345, and External 1 can send back packets to that destination.
However, the NAT will block packets coming from External 2, until the client sends out a packet to
External 2’s IP address. Once that is done, both External 1 and External 2 can send packets back to the
client, and they will both have the same mapping through the NAT.
Port Restricted Cone
If the client sends to External 1 on port 10101, the NAT will only allow through packets to the client that
come from 222.111.88.2:10101. Again, if the client has sent out packets to multiple IP:port pairs, they
can all respond to the client, and all of them will respond to the same mapped IP:port on the NAT.
Symmetric
If the client sends from 10.0.0.1:8000 to Computer B, it may be mapped as 202.123.211.25:12345,
whereas if the client sends from the same port (10.0.0.1:8000) to a different IP (Computer A), it is
mapped differently (202.123.211.25:45678). Computer B can only respond to its mapping and
Computer A can only respond to its mapping. If either one tries to send to the other's mapped IP:port,
those packets will be dropped.
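The four NAT behaviours can be compared with a small simulation. This is a toy model under stated assumptions: the class, its method names and the sequential port allocation are invented for the example, mappings never expire, and the second external address (198.51.100.7) is hypothetical.

```python
class Nat:
    """Toy NAT: kind is full_cone, restricted_cone, port_restricted_cone or symmetric."""

    def __init__(self, kind, public_ip="202.123.211.25", first_port=12345):
        self.kind = kind
        self.public_ip = public_ip
        self._next_port = first_port
        self._mappings = {}   # mapping key -> public (ip, port)
        self._internal = {}   # public (ip, port) -> internal (ip, port)
        self._contacted = {}  # public (ip, port) -> set of destinations contacted

    def outbound(self, src, dst):
        """Client at `src` sends to `dst`; returns the public (ip, port) used."""
        # A symmetric NAT allocates a new mapping per destination.
        key = (src, dst) if self.kind == "symmetric" else src
        if key not in self._mappings:
            self._mappings[key] = (self.public_ip, self._next_port)
            self._next_port += 1
        mapped = self._mappings[key]
        self._internal[mapped] = src
        self._contacted.setdefault(mapped, set()).add(dst)
        return mapped

    def inbound(self, src, mapped):
        """Return the internal address the packet reaches, or None if dropped."""
        if mapped not in self._internal:
            return None
        contacted = self._contacted[mapped]
        if self.kind == "full_cone":
            allowed = True
        elif self.kind == "restricted_cone":
            allowed = any(ip == src[0] for ip, _port in contacted)
        else:  # port_restricted_cone and symmetric require an exact IP:port match
            allowed = src in contacted
        return self._internal[mapped] if allowed else None
```

Running the same pair of outbound and inbound packets through each kind shows where a reply from an uncontacted host, or from a different source port, gets dropped.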
SIP Signaling Issues
1. SIP Proxy does not communicate back to SIP client on NAT’ed channel
1. IP address & port sent in SIP INVITE/200 OK (SDP) is Private, and not globally routable.
2. The NAT Proxy contacts the RTP Relay and requests it to set up a session.
3. The RTP Relay assigns an available pair of ports to this Call. It responds to the NAT Proxy with
downstream available port in RTP Relay. The NAT Proxy uses this to modify the SDP information
of the received INVITE request.
4. The NAT Proxy forwards the SIP INVITE request with modified SDP (reflecting the RTP Relay's
IP:port) on to the Voice Gateway.
5. The Gateway replies (in the 200 OK) with its own SDP information including the port to receive
RTP packets.
6. The NAT Proxy contacts the RTP Relay to supply the IP:port of the gateway (if the gateway was
also behind a symmetric NAT, then the NAT Proxy would instruct the Relay to wait for packets
from the Gateway before setting the IP:port to forward RTP on to the Gateway).
7. The Relay responds to the NAT Proxy with the upstream available RTP Port.
8. The NAT Proxy forwards the response upstream back to the UA after modifying the response
SDP with the IP:port of the RTP Relay.
9. UA begins sending RTP to the IP:port it received in the 200 OK – to the RTP Relay.
10. RTP Relay notes the IP:port that it received the packet from (for the first packet), and passes on
the packet to the IP:port of the gateway.
11. RTP packets proceed from the gateway to the RTP Relay.
12. The RTP Relay forwards those packets to the client (according to IP:port that it saved when it
received the first RTP packet from the client).
When BYE is received by the NAT Proxy, it forwards this information over to the RTP Relay which tears
down the session.
1. The client will always need to send and receive RTP on the same port.
2. This solution will work for all types of NATs, but because of the delay associated with the RTP
Relay (which may be substantial, especially if the RTP Relay is not close to at least one of the
endpoints), it should probably not be used unless a Symmetric NAT is involved. In other NAT
scenarios, modification of the SDP will be sufficient.
3. The client will not hear any voice until the first packet is sent to the RTP Relay. That could cause
problems when receiving a 183 message as part of the call setup, since the gateway at that point
opens a one-way media stream and passes back network announcements over that stream. If
the client has not yet sent its first RTP packet, the RTP relay does not yet know its public IP:port
address.
4. This is just one way of implementing an RTP Relay. There are other possibilities, including
schemes that do not insert themselves into the SIP flow.
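The relay behaviour in the call flow, and the one-way audio problem in note 3, can be sketched as follows. The class and method names are hypothetical, the addresses in the usage are made up, and a real relay would work on sockets and port pairs instead of returning tuples.

```python
class RtpRelaySession:
    """One relayed call: learns the client's public address from its first packet."""

    def __init__(self, gateway_addr):
        self.gateway_addr = gateway_addr  # supplied by the NAT Proxy from the 200 OK SDP
        self.client_addr = None           # unknown until the client sends RTP

    def on_packet(self, src, payload):
        """Return (destination, payload) to forward, or None if the packet is dropped."""
        if src == self.gateway_addr:
            if self.client_addr is None:
                # Client has not sent anything yet: nowhere to forward, e.g. early
                # media announcements after a 183 are lost.
                return None
            return (self.client_addr, payload)
        if self.client_addr is None:
            # First packet from the client: remember its public IP:port.
            self.client_addr = src
        return (self.gateway_addr, payload)
```

This also shows why the client must send and receive RTP on the same port: the relay sends media back to exactly the address it saw the first packet come from.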
ICE Solution
1. On deciding to initiate a SIP voice session the VOIP client starts a local STUN and TURN client to
obtain a mapping.
2. The client now constructs a SIP INVITE message. The INVITE request will use the addresses it has
obtained in the previous STUN/TURN interactions to populate the SDP of the SIP INVITE.
v=0
1. The SDP has been constructed to include all the available addresses that have been assembled.
• The first 'candidate' address contains the two STUN derived addresses for both RTP and
RTCP traffic. This entry has been given the highest priority (1.0) by the client and also
inserted as the default address.
• The second 'candidate' address contains the two TURN derived addresses for both RTP
and RTCP traffic. This entry has been given the second highest priority (0.8).
• The third and final 'candidate' address contains a local interface address that has not
been derived externally. This entry has been given the lowest priority (0.5).
2. The SIP signaling then traverses the NAT and sets up the SIP session. On advertising a candidate
address, the client should have a local STUN server running on each advertised candidate
address. This is for the purpose of responding to incoming connectivity checks.
3. The remote destination will also carry out similar STUN connectivity checks which then allows
media to be streamed to the client behind the NAT using the advertised connections. Two way
audio is now possible between the two clients.
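The candidate list and its priorities can be modelled as in the sketch below. The priority values 1.0, 0.8 and 0.5 come from the example above; the data class, the function names, the addresses and the `reachable` callback (standing in for a STUN connectivity check) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str        # "stun", "turn" or "local"
    rtp: str         # IP:port for RTP
    rtcp: str        # IP:port for RTCP
    priority: float  # higher is preferred


def order_candidates(candidates):
    """Highest priority first, as the peer would try them."""
    return sorted(candidates, key=lambda c: c.priority, reverse=True)


def pick_candidate(candidates, reachable):
    """Return the best candidate that passes the connectivity check, else None."""
    for candidate in order_candidates(candidates):
        if reachable(candidate):
            return candidate
    return None
```

If the STUN-derived address fails its connectivity check (as it would behind a symmetric NAT), the peer falls back to the TURN-derived address with the next highest priority.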
DIAMETER Server
DIAMETER protocol
SIP server
Serves for the purposes of locating the DIAMETER server that contains the user related data
1. SIP User Agent Client (UAC) sends a SIP REGISTER request to SIP server 1, which will receive the
SIP request. We assume that this SIP server may be located, e.g., at the edge of the
administrative home domain.
2. The Diameter client in SIP server 1 will contact its Diameter server by sending a Diameter User-
Authorization-Request (UAR) message to determine if this user is allowed to receive service, and
if so, request the address of a local SIP server capable of handling this user.
3. The Diameter server will answer with a Diameter User-Authorization-Answer (UAA) message
which will indicate either a list of capabilities that SIP server 1 may use to select an appropriate
SIP server (SIP server 2) and/or a SIP or SIPS URI pointing to SIP server 2.
4. SIP server 1 will forward the SIP REGISTER request to an appropriate SIP server (SIP server 2).
5. The Diameter client in SIP server 2 will then request user authentication from the Diameter
server by sending a Diameter Multimedia-Auth-Request (MAR) message.
6. The Diameter server will respond with a Diameter Multimedia-Auth-Answer (MAA) message
with Result-Code AVP set to the value DIAMETER_MULTI_ROUND_AUTH.
7. The Diameter server will also include a challenge, which SIP server 2 will map into the
WWW-Authenticate header in the SIP 401 (Unauthorized) response, which is sent back to SIP
server 1.
9. SIP server 1 will receive a next SIP REGISTER request containing the user credentials
10. The Diameter client in SIP server 1 will contact a Diameter server by sending a Diameter UAR
message to determine the SIP server allocated to the user.
11. The Diameter server will send the SIP or SIPS URI of SIP server 2 in a Diameter UAA message
12. SIP server 1 will then forward the SIP REGISTER request to SIP server 2
13. SIP server 2 will extract the credentials from the SIP REGISTER request. The Diameter client in
SIP server 2 will send those credentials in a Diameter MAR message to the Diameter server.
14. At this point, the Diameter server will be able to authenticate the user, and upon success, will
return a Diameter MAA message with the AVP Result-Code set to the value
DIAMETER_SUCCESS.
15. SIP server 2 will then generate a SIP 200 (OK) response which is forwarded to SIP server 1
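From the point of view of SIP server 2, the two MAR/MAA rounds above boil down to mapping the Diameter Result-Code onto a SIP response. The sketch below is only an illustration of that mapping: the function name, the string representation of result codes and the example challenge are assumptions, and result codes not mentioned in the flow are rejected instead of guessed at.

```python
def register_response(result_code: str, challenge: str = None):
    """Map the Result-Code of a Diameter MAA onto (SIP status code, extra headers)."""
    if result_code == "DIAMETER_MULTI_ROUND_AUTH":
        # First round: relay the challenge to the user agent.
        if challenge is None:
            raise ValueError("a challenge is needed for a 401 response")
        return 401, {"WWW-Authenticate": challenge}
    if result_code == "DIAMETER_SUCCESS":
        # Second round: credentials were accepted.
        return 200, {}
    raise ValueError(f"result code not covered by this sketch: {result_code}")
```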
2. The Diameter client in the SIP server sends a Multimedia-Auth-Request (MAR) message
3. The Diameter server sends a Multimedia-Auth-Answer (MAA) message that includes all the data
necessary for the SIP server to challenge the user, typically with HTTP Digest Authentication
indicated in the MAA message.
4. This data will serve the SIP server to create a SIP 407 (Proxy Authentication Required) response
that contains a challenge.
5. The SIP UA will create a new INVITE request that contains the credentials.
6. The Diameter client in the SIP server will send the credentials to the Diameter server in a new
Diameter MAR message
7. The Diameter server will validate the credentials and authorize the SIP transaction in a Diameter
MAA message
8. The SIP server forwards the SIP INVITE request to its destination as per regular SIP procedures.
9. Eventually, the session setup will be confirmed with a SIP 200 (OK) response.
10. The response is forwarded to the SIP UA. The session setup is complete.
Edge routers
Implement all mechanisms needed to perform admission control decision and policing function
COPS protocol
SIP server
Q-SIP
Enhanced SIP
1. The call setup starts with a standard SIP INVITE message sent by the caller to the local Q-SIP
server (caller-side Q-SIP server). The message carries the callee URI in the SIP header and the
session specification within the body SDP (media, codecs, source ports, etc).
2. The Q-SIP server decides whether a QoS session has to be started or not. Q-SIP server extracts
the required information from the message, inserts the additional Q-SIP header and the Record-
Route header information (to assure that all the messages for this session will pass through
itself) within the INVITE message. Then the Q-SIP server forwards the INVITE message towards the
invited callee.
3. When the Q-SIP server on the callee side (callee-side Q-SIP server) receives an INVITE message
that contains the SIP QoS extensions, it understands that a session with QoS has to be setup.
Therefore it extracts the needed information from the message, removes the Q-SIP extension
and inserts Record-Route header.
4. When the callee responds with a 200 OK message, it is passed back to the last Q-SIP server that
is the Q-SIP server that controls the access network of the callee.
5. At this point the Q-SIP server on the callee side has all the information to request a specific QoS
reservation to the ER on the callee access network for the callee-to-caller traffic flow.
6. When the callee-side Q-SIP server receives a positive response for the QoS reservation request, it
stores this QoS information, completing the QoS state, and sends the extension information for the
callee side within the 200 OK message toward the caller.
7. When the caller-side Q-SIP server receives the 200 OK message with the complete QoS session
indicators, it completes the QoS session setup by performing the QoS request to the ER on the
caller access network for the caller-to-callee traffic flow.
8. If the response is positive, the QoS state is completed, and the 200 OK is forwarded to the caller.
The fundamental difference from the unidirectional QoS reservation mode is that now there is only one
interaction with the QoS provider. In this case, when the caller-side Q-SIP server receives a 200 OK
response message for a QoS call, it starts a "bidirectional" QoS reservation with the local QoS provider.
The callee-side SIP server still participates in Q-SIP signaling but does not talk to a QoS provider.