
© 2003 Cisco Systems, Inc. All rights reserved.

SCTP
A detailed overview of the protocol and an examination of the socket API

Course Objectives: What You Should Get
• To come away with an understanding of the nuts and bolts of SCTP
• To know where in the course materials (the SCTP book and the RFCs) you can find information you may need when looking at an SCTP implementation
• To be able to understand the user interface to SCTP stacks (e.g., the SCTP sockets API)
• To know where the updates to the specification (and book) are (e.g., the I-G)

Prerequisites
• A basic understanding of IP and transport protocols
• Some knowledge of TCP may be helpful, but is not strictly required.
• Willingness to put up with engineers who are attempting to teach a tutorial :-D

Course Strategy
• We will first take a detailed look at the actual protocol mechanisms
• We will point out reference material along the way as appropriate (i.e., RFCs, Internet-Drafts, etc.)
• We expect YOU to ask questions if you get lost.
• We will cover a lot of ground in a limited time, so hold on to your seats :-D

Reference Materials
• [SCTP reference book] Stream Control Transmission Protocol (SCTP): A Reference Guide, R. Stewart and Q. Xie, Addison-Wesley, 2002, ISBN 0-201-72186-4
• RFC 2960: Stream Control Transmission Protocol, October 2000
• RFC 3309: SCTP Checksum Change, September 2002
• [I-G] draft-ietf-tsvwg-sctpimpguide-10: SCTP Implementer's Guide

SCTP Programming References
• [sockets API] draft-ietf-tsvwg-sctpsocket-07: Sockets API Extensions for SCTP
• UNIX Network Programming, Volume 1, Third Edition, Stevens-Fenner-Rudoff, Addison-Wesley, 2004, ISBN 0-13-141155-1

SCTP Extensions Drafts
• [PR-SCTP] RFC 3758
• [Add-IP] draft-ietf-tsvwg-addip-sctp-08: SCTP Dynamic Address Reconfiguration
• [Pkt-Drop] draft-stewart-sctp-pktdrprep-00: SCTP Packet Drop Reporting
• [Auth] draft-tuexen-sctp-auth-chunk-00: Authenticated Chunks for SCTP

Online References
• http://www.sctp.org
    Also reachable with HTTP over SCTP!
• http://www.ietf.org/html.charters/tsvwg-charter.html
    All current work on SCTP is done in the IETF TSVWG
• sctp-impl on mailer.cisco.com

Features of SCTP
• Reliable data transfer w/SACK
• Congestion control and avoidance
• Message boundary preservation
• PMTU discovery and message fragmentation
• Message bundling
• Multi-homing support
• Multi-stream support
• Unordered data delivery option
• Security cookie against connection flood attack (SYN flood)
• Built-in heartbeat (reachability check)
• Extensibility
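
IP Multi-homing
• The following figure depicts a typical multi-homed host. Keep this picture in mind when we discuss multi-homing.
[Figure: a multi-homed host. Applications App-1, App-2, and App-3 run on top of the OS, which has three network interfaces (NI-1, NI-2, NI-3) with the addresses 160.15.82.20, 161.10.8.221, and 10.1.61.11.]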

Of Endpoints and Associations
• Two fundamental concepts in SCTP
    Endpoints (communicating parties)
    Associations (communication relationships)
• These two concepts are key to understanding the protocol and its basic operation
• We start by defining an "SCTP Transport Address"

An SCTP Transport Address
• Each transport protocol defines a transport level header
• The transport level header helps demultiplex data coming to a host to the correct applications
• Applications in TCP and UDP bind to a "port", which forms the core method for demultiplexing data

SCTP Transport Address (cont.)
• SCTP also defines the same byte positions in its transport header for the two 16-bit port fields
• We term the combination of an SCTP port and an IP address an "SCTP Transport Address"
• The IP address in an SCTP Transport Address MUST be a routable unicast address
    i.e., multicast and broadcast addresses are invalid

An SCTP Endpoint
• An SCTP endpoint is the logical end of the SCTP transport protocol - a communicating party
• An SCTP endpoint may have MORE than one IP address but it always has one and only one port number
• An application typically will open an SCTP socket and bind one address, a set of addresses, or all addresses to that socket
    This socket can then be thought of as an SCTP endpoint

SCTP Endpoints II
• An SCTP endpoint can be represented as a list of SCTP transport addresses with the same port:
    endpoint = [10.1.4.2, 10.1.5.3 : 80]
• An SCTP transport address can only be bound to one single SCTP endpoint

SCTP Endpoint III
[Figure: the multi-homed host with interfaces NI-1, NI-2, NI-3 (addresses 160.15.82.20, 161.10.8.221, 10.1.61.11). Application-1 is bound to (161.10.8.221 : 2223).]

SCTP Endpoint IV
• Application-1 has bound one IP address of the host with the port 2223.
• If a new application, Application-2, is started, it may legally bind [160.15.82.20 : 2223] or [10.1.61.11 : 2223] or even [160.15.82.20, 10.1.61.11 : 2223]
• The new application will NOT be able to bind the existing SCTP Transport Address that Application-1 has bound, i.e., [161.10.8.221 : 2223]
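The binding rule above can be sketched as a small model. This is only an illustration of the rule, not the sockets API: the `Host` class, `BindError`, and the `Application-3` name are hypothetical.

```python
class BindError(Exception):
    """Raised when a requested transport address cannot be bound."""


class Host:
    """Hypothetical model of the SCTP bindings on one multi-homed host."""

    def __init__(self, addresses):
        self.addresses = set(addresses)
        self.bound = {}  # (ip, port) -> name of the endpoint that owns it

    def bind(self, name, ips, port):
        # An endpoint has one port and one or more IP addresses.
        wanted = [(ip, port) for ip in ips]
        for ip, _ in wanted:
            if ip not in self.addresses:
                raise BindError(f"{ip} is not an address of this host")
            if (ip, port) in self.bound:
                owner = self.bound[(ip, port)]
                raise BindError(f"{ip}:{port} already bound to {owner}")
        for transport_address in wanted:
            self.bound[transport_address] = name
        return wanted


host = Host(["160.15.82.20", "161.10.8.221", "10.1.61.11"])
host.bind("Application-1", ["161.10.8.221"], 2223)
host.bind("Application-2", ["160.15.82.20", "10.1.61.11"], 2223)
try:
    host.bind("Application-3", ["161.10.8.221"], 2223)
except BindError as exc:
    conflict = str(exc)
```

The third bind fails because the transport address 161.10.8.221:2223 already belongs to Application-1, while the second succeeds because it uses the same port on different addresses.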

SCTP Associations
• Like TCP, SCTP is connection-oriented
• A connection-oriented protocol is one that requires a setup procedure to establish the communication relationship (and state) between two parties
• To establish this state, both sides go through a specific set of exchanges
    TCP uses a 3-way handshake (SYN, SYN/ACK, ACK)
    SCTP uses a 4-way handshake (we examine this later)

SCTP Association II
• In TCP, the communication relationship between two endpoints is called a "connection"
• In SCTP, this is called an "association", because it is a broader concept than a single connection (i.e., multi-homing)
• An SCTP association can be represented as a pair of SCTP endpoints:
    assoc = { [10.1.61.11 : 2223], [161.10.8.221, 120.1.1.5 : 80] }

SCTP Association III
• An SCTP endpoint may have multiple associations
• Only one association may be established between any two SCTP endpoints

Operation of SCTP Associations
• An SCTP association provides reliable data transfer of messages
• Messages are sent within a stream, which is identified by a stream identifier (SID)
• Messages can be ordered or unordered:
    Each ordered message sent within a stream is also assigned a stream sequence number (SSN)
    Unordered messages have no SSN and are delivered with no respect to ordering

SCTP Streams
• We will discuss further details in the Data Transfer section later
[Figure: SCTP streams between two endpoints, each shown with an Sd-queue and a Ro-queue.]

SCTP States I
[State diagram: association setup]
    CLOSED -> CLOSED: (rcv INIT) generate cookie, send INIT-ACK
    CLOSED -> COOKIE_WAIT: (ASSOCIATE) create TCB, send INIT, start init timer
    COOKIE_WAIT -> COOKIE_ECHOED: (rcv INIT-ACK) send COOKIE-ECHO, stop init timer, start cookie timer
    COOKIE_ECHOED -> ESTABLISHED: (rcv COOKIE-ACK) stop cookie timer
    CLOSED -> ESTABLISHED: (rcv valid COOKIE-ECHO) create TCB, send COOKIE-ACK
Page 31 of the SCTP book

SCTP States II
[State diagram: start of shutdown]
    ESTABLISHED -> SHUTDOWN-PENDING: (SHUTDOWN) check outstanding data chunks
    SHUTDOWN-PENDING -> next slide: (no more outstanding data chunks) send SHUTDOWN, start shutdown timer
    ESTABLISHED -> SHUTDOWN-RECEIVED: (rcv SHUTDOWN) check outstanding data chunks
    SHUTDOWN-RECEIVED -> next slide: (no more outstanding data chunks) send SHUTDOWN-ACK, start shutdown timer
Page 32 of the SCTP book

SCTP States III
[State diagram: end of shutdown, continued from States II]
    SHUTDOWN-SENT -> CLOSED: (rcv SHUTDOWN-ACK) send SHUTDOWN-COMPLETE, stop shutdown timer, delete TCB
    SHUTDOWN-ACK-SENT -> CLOSED: (rcv SHUTDOWN-COMPLETE) stop shutdown timer, delete TCB
Page 32 of the SCTP book
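The three state slides can be condensed into a lookup table. The sketch below is a simplification under stated assumptions: it covers only the transitions drawn on these slides, the state names follow RFC 2960, timers and TCB handling are omitted, and the event strings and the `run` helper are hypothetical labels chosen for this example.

```python
# (current state, event) -> next state; only the transitions from the slides.
TRANSITIONS = {
    ("CLOSED", "ASSOCIATE"): "COOKIE_WAIT",
    ("CLOSED", "rcv INIT"): "CLOSED",  # reply with INIT-ACK, keep no state
    ("CLOSED", "rcv valid COOKIE-ECHO"): "ESTABLISHED",
    ("COOKIE_WAIT", "rcv INIT-ACK"): "COOKIE_ECHOED",
    ("COOKIE_ECHOED", "rcv COOKIE-ACK"): "ESTABLISHED",
    ("ESTABLISHED", "SHUTDOWN"): "SHUTDOWN-PENDING",
    ("ESTABLISHED", "rcv SHUTDOWN"): "SHUTDOWN-RECEIVED",
    ("SHUTDOWN-PENDING", "no more outstanding data"): "SHUTDOWN-SENT",
    ("SHUTDOWN-RECEIVED", "no more outstanding data"): "SHUTDOWN-ACK-SENT",
    ("SHUTDOWN-SENT", "rcv SHUTDOWN-ACK"): "CLOSED",
    ("SHUTDOWN-ACK-SENT", "rcv SHUTDOWN-COMPLETE"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Feed a list of events through the table and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError: transition not modeled
    return state


active_open = run(["ASSOCIATE", "rcv INIT-ACK", "rcv COOKIE-ACK"])
passive_open = run(["rcv INIT", "rcv valid COOKIE-ECHO"])
```

Note that the passive side stays in CLOSED after receiving an INIT and only moves to ESTABLISHED once a valid COOKIE-ECHO arrives.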

Questions
• Here we pause for any questions??
• Do you have any?

Bits, Bytes, and Chunks
• We will now turn our attention to the on-the-wire bits and bytes of SCTP
• An SCTP packet has a common header that appears in each packet, followed by one or more chunks
• SCTP chunks use a self-describing Tag-Length-Value (TLV) format
• Note: all figures used are always 32 bits wide

SCTP Packet With IP Header
[Figure: packet layout, top to bottom]
    IP Header
    SCTP Common Header
    Chunk 1
    ...
    Chunk N

SCTP Common Header
[Figure: each row is 32 bits wide]
    Source Port | Destination Port
    Verification Tag
    CRC-32c Checksum

SCTP Common Header Fields
• Source and Destination Port: 16-bit port values
• Verification Tag: 32-bit random value selected by each endpoint in an association during setup
    Discriminates between two successive associations
    Protection mechanism against blind attackers
• CRC32c Checksum: 32-bit CRC covering the entire SCTP packet (SCTP common header and all chunks)
    Note that RFC 3309 (CRC32c) supersedes the Adler-32 checksum defined in RFC 2960 (SCTP)
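As a rough sketch of these fields, the following builds and parses a 12-byte common header and checks the CRC32c. Assumptions to note: `build_packet` and `parse_common_header` are hypothetical helper names; the checksum is computed with the checksum field set to zero; and the CRC value is stored in little-endian byte order here, which mirrors common implementations but should be checked against RFC 3309 before relying on it.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def build_packet(src_port, dst_port, vtag, chunks=b""):
    # Ports and tag in network byte order; checksum field zero while hashing.
    header = struct.pack("!HHII", src_port, dst_port, vtag, 0)
    crc = crc32c(header + chunks)
    # Assumption: CRC placed in the header in little-endian byte order.
    return header[:8] + struct.pack("<I", crc) + chunks


def parse_common_header(packet):
    src_port, dst_port, vtag = struct.unpack("!HHI", packet[:8])
    (stored,) = struct.unpack("<I", packet[8:12])
    zeroed = packet[:8] + b"\x00\x00\x00\x00" + packet[12:]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "vtag": vtag,
        "checksum_ok": stored == crc32c(zeroed),
    }


cookie_ack = struct.pack("!BBH", 0x0B, 0, 4)  # a minimal 4-byte chunk
packet = build_packet(2223, 80, 0x12345678, cookie_ack)
fields = parse_common_header(packet)
```

The bitwise loop is slow but easy to audit; real stacks use table-driven or hardware CRC32c.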

SCTP Chunks
[Figure: each row is 32 bits wide]
    Chunk Type | Chunk Flags | Chunk Length
    Chunk Data

SCTP Chunk Header Fields
• Chunk Type: 8-bit value indicating the type of chunk
• Chunk Flags: 8-bit flags, defined on a per-chunk-type basis
• Chunk Length: 16-bit length in bytes, including the chunk type, chunk flags, and chunk length fields.
    Note that chunks are padded to 32-bit boundaries within an SCTP packet. Any padding bytes (0x00) used are NOT included in the chunk length
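The length and padding rule is easiest to see when walking the chunks of a packet. The sketch below is illustrative only; `iter_chunks` is a hypothetical helper, and the input is assumed to be the bytes that follow the 12-byte common header.

```python
import struct


def iter_chunks(payload: bytes):
    """Yield (type, flags, value) for each chunk after the common header."""
    offset = 0
    while offset + 4 <= len(payload):
        ctype, flags, length = struct.unpack_from("!BBH", payload, offset)
        if length < 4 or offset + length > len(payload):
            raise ValueError("bad chunk length")
        # Chunk Length covers the 4-byte chunk header plus the value.
        yield ctype, flags, payload[offset + 4:offset + length]
        # Padding is not counted in Chunk Length, so round up to 32 bits.
        offset += (length + 3) & ~3


# A COOKIE-ECHO with a 5-byte cookie (length 9, padded to 12 bytes),
# followed by a COOKIE-ACK (length 4, no padding needed).
payload = (
    struct.pack("!BBH", 0x0A, 0, 9) + b"12345" + b"\x00" * 3
    + struct.pack("!BBH", 0x0B, 0, 4)
)
chunks = list(iter_chunks(payload))
```

Without the rounding step the parser would land on the padding bytes and misread them as the next chunk header.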

List of Chunk Types I
• There are 20 chunk types currently defined in SCTP (including non-RFC/Internet Draft extensions):
    (1) DATA (0x00)
    (2) INITIATION [INIT] (0x01)
    (3) INITIATION-ACKNOWLEDGMENT [INIT-ACK] (0x02)
    (4) SELECTIVE-ACKNOWLEDGMENT [SACK] (0x03)
    (5) HEARTBEAT (0x04)

List of Chunk Types II
    (6) HEARTBEAT-ACKNOWLEDGMENT [HEARTBEAT-ACK] (0x05)
    (7) ABORT (0x06)
    (8) SHUTDOWN (0x07)
    (9) SHUTDOWN-ACKNOWLEDGMENT [SHUTDOWN-ACK] (0x08)
    (10) OPERATIONAL-ERROR [ERROR] (0x09)
    (11) COOKIE-ECHO (0x0A)
    (12) COOKIE-ACKNOWLEDGMENT [COOKIE-ACK] (0x0B)

List of Chunk Types III
    (13) EXPLICIT CONGESTION NOTIFICATION ECHO [ECNE] (0x0C)
    (14) CONGESTION WINDOW REDUCED [CWR] (0x0D)
    (15) SHUTDOWN-COMPLETE (0x0E)

List of Chunk Types: Extensions
• PR-SCTP - RFC 3758
    (16) FORWARD-TSN (0xC0)
• ADD-IP draft
    (17) ADDRESS-CONFIGURATION [ASCONF] (0xC1)
    (18) ADDRESS-CONFIGURATION-ACKNOWLEDGMENT [ASCONF-ACK] (0x80)
• Packet-Drop draft
    (19) SCTP-PACKET-DROP-REPORT [PKT-DROP] (0x81)
• Authentication draft
    (20) AUTHENTICATION [AUTH] (0x82) - about to undergo drastic changes and will probably add 2-3 chunks.

General Chunk Processing
• In any SCTP packet, control chunks always come before DATA chunks
• Some chunks must be singletons, i.e., the only chunk in their packet: INIT and INIT-ACK
• Chunk type number assignments are not linear
• The upper two bits of the chunk type have specific meanings used for processing unrecognized chunks

Chunk Type Processing
• A bit pattern of 00xxxxxx in the chunk type indicates that if this chunk is unknown to the receiver, silently drop it and stop processing the rest of the packet
• A bit pattern of 01xxxxxx in the chunk type indicates that if this chunk is unknown to the receiver, drop it, send an ERROR chunk in reply, and stop processing the rest of the packet

Chunk Type Processing II
• A bit pattern of 10xxxxxx in the chunk type indicates that if this chunk is unknown to the receiver, silently skip this chunk but continue to process the rest of the chunks in the packet
• A bit pattern of 11xxxxxx in the chunk type indicates that if this chunk is unknown to the receiver, skip this chunk but send an ERROR chunk in reply and continue to process the rest of the chunks in the packet

Pop Quiz
• To see if you are paying attention:
    Assume you have an SCTP implementation that understands NONE of the extensions mentioned earlier.
• What will the implementation do with:
    - FORWARD-TSN (0xC0)
    - ASCONF (0xC1)
    - ASCONF-ACK (0x80)
    - PKT-DROP (0x81)
    - AUTHENTICATION (0x82)
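One way to check your quiz answers is to encode the upper-two-bit rule directly. This is a minimal sketch; `unknown_chunk_action` is a hypothetical helper and the returned pair is a convention chosen for this example.

```python
def unknown_chunk_action(chunk_type: int):
    """Return (stop_processing, report_error) for an unrecognized chunk type."""
    upper_bits = (chunk_type >> 6) & 0b11
    stop_processing = upper_bits in (0b00, 0b01)  # 00 and 01: drop the rest
    report_error = upper_bits in (0b01, 0b11)     # 01 and 11: send an ERROR
    return stop_processing, report_error


QUIZ = {
    "FORWARD-TSN": 0xC0,
    "ASCONF": 0xC1,
    "ASCONF-ACK": 0x80,
    "PKT-DROP": 0x81,
    "AUTHENTICATION": 0x82,
}
answers = {name: unknown_chunk_action(value) for name, value in QUIZ.items()}
```

Inspecting `answers` shows that all five extension chunks are skipped rather than ending packet processing; they differ only in whether an ERROR chunk is sent back.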

SCTP Chunk Parameters
• Some chunks have parameters within them
    Examples: INIT, INIT-ACK, HEARTBEAT
• A parameter also has a TLV format
• A parameter has a similar format to a chunk but slightly different (see the next slide).
• Processing rules for unknown parameters are similar to those for the chunk processing with slightly different connotations

Parameters Format
[Figure: example parameter, each row is 32 bits wide]
    Type = 0x0033 | Length = 8
    4 Octets of Data
Note the 16-bit Parameter Type
Note the 16-bit Length, including the header
The variable-length data goes here (the "4 Octets of Data" row)

Parameter Handling Rules I
• The upper 2 bits of the 16-bit parameter type are again used to tell an implementation what to do with an unknown parameter
    00xxxxxx-xxxxxxxx: indicates to stop processing the parameter and silently discard this chunk
    01xxxxxx-xxxxxxxx: indicates to stop processing the parameter, report this in an ERROR (or INIT-ACK) chunk, and discard this chunk

Parameter Handling Rules II
    10xxxxxx-xxxxxxxx: indicates silently skip this parameter, and continue processing the rest of this chunk
    11xxxxxx-xxxxxxxx: indicates skip this parameter, report this in an ERROR (or INIT-ACK) chunk, and continue processing the rest of this chunk
• Note that no matter what results from processing each individual parameter, the rest of the chunks in the packet are always processed
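The four parameter rules can be sketched as a TLV walk over the variable part of a chunk. Assumptions: `walk_parameters` and the `KNOWN` set are hypothetical (a receiver that understands only the IPv4 and IPv6 Address parameters), and parameters are padded to 32-bit boundaries with the padding not counted in the length, as RFC 2960 specifies.

```python
import struct

KNOWN = {0x0005, 0x0006}  # illustration only: IPv4 Address, IPv6 Address


def walk_parameters(body: bytes, known=KNOWN):
    """Return (accepted, reported, keep_chunk) for a chunk's parameters."""
    accepted, reported = [], []
    offset = 0
    while offset + 4 <= len(body):
        ptype, length = struct.unpack_from("!HH", body, offset)
        if length < 4 or offset + length > len(body):
            raise ValueError("bad parameter length")
        value = body[offset + 4:offset + length]
        if ptype in known:
            accepted.append((ptype, value))
        else:
            action = ptype >> 14            # upper two bits of the type
            if action & 0b01:               # 01 and 11: report it
                reported.append(ptype)
            if not action & 0b10:           # 00 and 01: stop, discard chunk
                return accepted, reported, False
        offset += (length + 3) & ~3         # skip padding
    return accepted, reported, True


body = (
    struct.pack("!HH", 0x0005, 8) + bytes([10, 1, 61, 11])  # IPv4 Address
    + struct.pack("!HH", 0x8000, 4)                         # unknown: skip
    + struct.pack("!HH", 0xC006, 8) + b"\x00\x00\x00\x01"   # unknown: report
    + struct.pack("!HH", 0x0006, 20) + bytes(16)            # IPv6 Address
)
accepted, reported, keep_chunk = walk_parameters(body)
```

Here the two unknown parameters are skipped (one silently, one reported) and the chunk is still processed, so both address parameters are accepted.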

Chunk Details
• We now turn our attention to the individual chunk details.
• We will examine each chunk in the order it would appear in a typical association setup, data exchange, and shutdown.
• Extension chunks will be left up to the reader to explore in the individual drafts.

INIT Chunk
[Figure: each row is 32 bits wide]
    Type=1 | Flags=0 | Length=variable
    Initiation Tag
    Receiver window credit
    # Out Streams | Max # In Streams
    Initial TSN
    Optional/Variable length parameters

INIT (and INIT-ACK) Chunk Fields
• Initiation Tag: non-zero random 32-bit nonce value
• Receiver Window Credit: initial rwnd used for flow control
• # of Outbound Streams: number of streams the sender wishes to use
• Max # of Inbound Streams: maximum number of streams the sender supports
• Initial TSN: initial 32-bit TSN used for data transfer, which is also a random value (it may be copied from the initiation tag)
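The fixed part of an INIT can be sketched with one `struct` layout. This is an illustration under assumptions: `build_init` and `parse_init` are hypothetical helpers, no optional parameters are added, and the Initial TSN is simply copied from the initiation tag, which the slide notes is permitted.

```python
import secrets
import struct

# Initiation Tag, rwnd, # Outbound Streams, Max # Inbound Streams, Initial TSN
INIT_FIXED = struct.Struct("!IIHHI")


def build_init(rwnd, out_streams, max_in_streams):
    tag = 0
    while tag == 0:                      # the tag must be non-zero
        tag = secrets.randbits(32)
    initial_tsn = tag                    # allowed: copy of the initiation tag
    body = INIT_FIXED.pack(tag, rwnd, out_streams, max_in_streams, initial_tsn)
    # Chunk header: Type=1 (INIT), Flags=0, Length includes the 4-byte header.
    return struct.pack("!BBH", 0x01, 0, 4 + len(body)) + body


def parse_init(chunk):
    ctype, flags, length = struct.unpack_from("!BBH", chunk)
    tag, rwnd, out_streams, max_in, tsn = INIT_FIXED.unpack_from(chunk, 4)
    return {
        "type": ctype,
        "length": length,
        "initiation_tag": tag,
        "rwnd": rwnd,
        "out_streams": out_streams,
        "max_in_streams": max_in,
        "initial_tsn": tsn,
    }


init = parse_init(build_init(rwnd=64200, out_streams=10, max_in_streams=16))
```

With no optional parameters the chunk is 20 bytes: 4 bytes of chunk header plus 16 bytes of fixed fields.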

INIT / INIT-ACK Chunk Summary
• INIT / INIT-ACK chunks have fixed and variable parts
• The variable part is made up of parameters
• The parameters specify options and features supported by the sender
• Most parameters are valid for both the INIT and the INIT-ACK

INIT-ACK Chunk
[Figure: each row is 32 bits wide]
    Type=2 | Flags=0 | Length=variable
    Initiation Tag
    Receiver window credit
    # Out Streams | Max # In Streams
    Initial TSN
    Optional/Variable length parameters

INIT and INIT-ACK Parameters
    PARAMETER                    TYPE     INIT   INIT-ACK
    IPv4 Address                 0x0005   YES    YES
    IPv6 Address                 0x0006   YES    YES
    Cookie Preservative          0x0009   YES    NO
    ECN Capable                  0x8000   YES    YES
    Host Name Address            0x000B   YES    YES
    Supported Address Types      0x000C   YES    YES
    Unrecognized Parameters      0x0008   NO     YES
    State Cookie                 0x0007   NO     YES
    PR-SCTP Supported            0xC000   YES    YES
    Set Primary Address          0xC004   YES    YES
    Adaption Layer Indication    0xC006   YES    YES

Cookie Echo Chunk
[Figure: each row is 32 bits wide]
    Type=0x0A | Flags=0 | Length=variable
    State Cookie from INIT-ACK

Cookie Ack Chunk
• The Cookie-Echo and Cookie-ACK are simple chunks, but help prevent resource attacks
• They serve as the last part of the 4-way handshake that sets up an SCTP association
• Both allow bundling with other chunks, such as DATA
[Figure]
    Type=0x0B | Flags=0 | Length=4

DATA Chunk
• Flag bits 'UBE' are used to indicate:
    U - Unordered Data
    B - Beginning of Fragmented Message
    E - End of Fragmented Message
• A user message that fits in one chunk would have both the B and E bits set
[Figure: each row is 32 bits wide]
    Type=0x00 | Flags=UBE | Length=variable
    TSN Value
    Stream Identifier | Stream Sequence Num
    Payload Protocol Identifier
    Variable Length User Data

DATA Chunk Fields
• TSN: transmission sequence number used for ordering, reassembly, and retransmission
• Stream Identifier: the stream number for this DATA
• Stream Sequence Number: identifies which message this DATA belongs to for this stream
• Payload Protocol Identifier: opaque value used by the endpoints (and perhaps network equipment)
• User Data: the user message (or a portion of it)
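To show how the B and E bits and the per-fragment TSNs fit together, here is a minimal fragmentation sketch. Assumptions: `fragment` is a hypothetical helper; the flag values (E=0x01, B=0x02, U=0x04) follow RFC 2960; every fragment of one message shares the same stream identifier and SSN while the TSN increases by one per fragment.

```python
import struct

FLAG_E, FLAG_B, FLAG_U = 0x01, 0x02, 0x04


def fragment(message, max_payload, tsn, sid, ssn, ppid=0, unordered=False):
    """Split one user message into padded DATA chunks."""
    if not message:
        raise ValueError("a DATA chunk must carry user data")
    pieces = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)]
    chunks = []
    for index, piece in enumerate(pieces):
        flags = FLAG_U if unordered else 0
        if index == 0:
            flags |= FLAG_B                  # first fragment
        if index == len(pieces) - 1:
            flags |= FLAG_E                  # last fragment
        header = struct.pack(
            "!BBHIHHI",
            0x00,                            # Type = DATA
            flags,
            16 + len(piece),                 # length excludes padding
            (tsn + index) & 0xFFFFFFFF,      # one TSN per fragment
            sid, ssn, ppid,
        )
        chunks.append(header + piece + b"\x00" * (-len(piece) % 4))
    return chunks


three = fragment(b"abcdefghij", max_payload=4, tsn=100, sid=1, ssn=0)
single = fragment(b"hi", max_payload=4, tsn=200, sid=1, ssn=1)
```

A message that fits in one chunk ends up with both B and E set (flags 0x03), matching the slide.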

SACK Chunk
[Figure: each row is 32 bits wide]
    Type=3 | Flags=0 | Length=variable
    Cumulative TSN
    Receiver window credit
    Num of Fragments=N | Num of Dup=M
    Gap Ack Blk #1 start | Gap Ack Blk #1 end
    ...
    Gap Ack Blk #N start | Gap Ack Blk #N end
    Duplicate TSN #1
    ...
    Duplicate TSN #M

SACK Chunk Fields
• Cumulative TSN Acknowledgment: the highest consecutive TSN that the SACK sender has received
    a.k.a. cumulative ack (cum-ack) point
• Receiver Window Credit: current rwnd available for the peer to send
• # of Fragments: number of Gap Ack Blocks included
• # of Duplicates: number of Duplicate TSN reports included

SACK Chunk Fields II
• Gap Ack Block Start / End TSN offset: the start and end offset for a range of consecutive TSNs received, relative to the cumulative ack point
    The TSNs not covered by a Gap Ack Block indicate TSNs that are "missing"
• Duplicate TSN: TSN that has been received more than once
    Note that the same TSN may be reported more than once

SACK Chunk Example
[Figure: each row is 32 bits wide]
    Type=3 | Flags=0 | Length=variable
    Cum Ack=109965
    rwnd = 64200
    Num of Fragments=2 | Num of Dup=2
    Gap start = 2 | Gap end = 5
    Gap start = 7 | Gap end = 9
    Duplicate TSN = 109963
    Duplicate TSN = 109964

SACK Example Dissected
• The SACK sender's cum-ack point is 109,965
• The sender has received TSNs 109,967 through 109,970
• The sender has received TSNs 109,972 through 109,974
• The sender is missing 109,966 and 109,971.
• The sender received duplicate transmissions of 109,963 and 109,964
• Question: Would you ever see a Gap Ack start of 1?
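The arithmetic behind this dissection can be reproduced in a few lines. This is a sketch only: `dissect_sack` is a hypothetical helper, and it ignores TSN wraparound at 2^32 for simplicity.

```python
def dissect_sack(cum_ack, gap_blocks):
    """Expand gap ack block offsets into received and missing TSN lists."""
    received, missing = [], []
    expected = cum_ack + 1                   # first TSN not yet cum-acked
    for start, end in gap_blocks:
        first, last = cum_ack + start, cum_ack + end
        missing.extend(range(expected, first))
        received.extend(range(first, last + 1))
        expected = last + 1
    return received, missing


received, missing = dissect_sack(109965, [(2, 5), (7, 9)])
```

Each offset is added to the cumulative ack point, so the block (2, 5) covers TSNs 109,967 through 109,970 and the hole before it is the single TSN 109,966.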

Heartbeat Chunk
• Data within the Heartbeat Data parameter is implementation specific
[Figure: each row is 32 bits wide]
    Type=4 | Flags=0 | Length=variable
    Parm Type = 1 | Length=variable
    Heartbeat Data

Heartbeat Ack Chunk
• Data within the Heartbeat Data parameter is implementation specific and is a straight echo of what was received in the Heartbeat chunk
[Figure: each row is 32 bits wide]
    Type=5 | Flags=0 | Length=variable
    Parm Type = 1 | Length=variable
    Heartbeat Data

Shutdown Chunks
[Figure: each row is 32 bits wide]
SHUTDOWN
    Type=7 | Flags=0 | Length=8
    Cumulative TSN
SHUTDOWN-ACK
    Type=8 | Flags=0 | Length=4
SHUTDOWN-COMPLETE
    Type=14 | Flags=T | Length=4

Shutdown Chunk Fields
• The SHUTDOWN chunk also carries a Cumulative TSN Acknowledgment field to indicate the highest TSN that the SHUTDOWN sender has seen.
• A SACK chunk may be bundled to give a more complete picture (e.g., Gap Ack Blocks) of the sender's receive state.

Operational Error Chunk
[Figure: each row is 32 bits wide]
    Type=0x09 | Flags=0 | Length=variable
    Error Cause=xxxx | Length=variable
    Error Cause
One or more error causes

Summary of Error Causes
    Type Value   Error Cause
    0x0001       Invalid Stream Identifier
    0x0002       Missing Mandatory Parameter
    0x0003       Stale Cookie Error
    0x0004       Out of Resource
    0x0005       Unresolvable Address
    0x0006       Unrecognized Chunk Type
    0x0007       Invalid Mandatory Parameter
    0x0008       Unrecognized Parameter Type
    0x0009       No User Data
    0x000A       Cookie Received While Shutting Down
    0x000B       Restart of Association With New Addresses
    0x000C       User Initiated Abort
    0x000D       Protocol Violation

Abort Chunk
[Figure: each row is 32 bits wide]
    Type=6 | Flags=T | Length=variable
    Error Cause=xxxx | Length=variable
    Error Cause
Zero or more error causes

The T-bit
• Both the SHUTDOWN-COMPLETE and ABORT chunks use one flag value
• The T bit is the first (lowest-order) bit, i.e., binary -------x
• When this bit is set to 0, the sender has a TCB and the V-Tag (in the common header) is the correct one for the association.
• When this bit is set to 1, the sender has NO TCB and the V-Tag is set to what was in the V-Tag value of the packet that is being responded to.
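From the receiver's side, the T bit decides which tag to compare against. The sketch below is an interpretation of the two rules above: `vtag_acceptable` is a hypothetical helper, `my_vtag` is the tag this endpoint expects in packets it receives, and `peer_vtag` is the tag it places in packets it sends.

```python
def vtag_acceptable(t_bit, packet_vtag, my_vtag, peer_vtag):
    """Check the V-Tag of a received ABORT or SHUTDOWN-COMPLETE."""
    if t_bit == 0:
        # The sender had a TCB, so it used the tag we handed out at setup.
        return packet_vtag == my_vtag
    # The sender had no TCB and reflected the tag from the packet we sent,
    # which carried the peer's tag.
    return packet_vtag == peer_vtag


normal_abort = vtag_acceptable(0, packet_vtag=0xA, my_vtag=0xA, peer_vtag=0x5)
reflected_abort = vtag_acceptable(1, packet_vtag=0x5, my_vtag=0xA, peer_vtag=0x5)
```

Either way a blind attacker must still guess a 32-bit value to tear down the association.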

Forward-TSN Chunk
[Figure: each row is 32 bits wide]
    Type=192 | Flags=0 | Length=variable
    New Cumulative TSN
    Stream Id 1 | Stream Seq 1
    ...
    Stream Id N | Stream Seq N

Forward-TSN Chunk Fields
• New Cumulative TSN: the new cumulative ack point that the receiver should move forward (skip) to
    Treat all TSNs up to this new point as having been received
• Stream Identifier/Stream Sequence Number: the largest stream sequence number being skipped for a given stream
• Multiple Stream Identifier-Sequence Number pairs may be included if the Forward TSN covers multiple messages

Forward TSN Operation
• Used to move the cumulative ack point forward without retransmitting data.
    Note the receiver could move the point forward further if the Forward TSN skips past a missing block of TSNs
• Has zero or more stream and sequence numbers listed to help a receiver free stranded data.
• Is part of the soon to be RFC'd PR-SCTP document.

Other Extensions
• Several SCTP extensions exist
• Packet Drop is a Cisco-originated extension that inter-works the router with the endpoint.
• ADD-IP allows for dynamic addition and subtraction of IP addresses
• AUTH allows for two endpoints to negotiate the signing of specific chunks (such as ADD-IP chunks).
    It uses the Purpose-Built Keys (PBK) draft

Parameters and Error Causes
• RFC 2960 lays out all the basic data formats
• The SCTP book, on pages 47-55, also holds illustrations of the various chunk layouts and details.
• Error causes are also in the RFC and can also be found on pages 65-73 of the SCTP book
• The SCTP Implementer's Guide (draft) contains a few new parameters mentioned previously
• We will let your curiosity guide you in viewing these bits and bytes if you're interested

Questions
• Questions before we break
• In the next sections, we will begin going through the protocol operation details
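
Setting Up an Association
[Figure: message sequence between Endpoint A and Endpoint Z]
    Endpoint A -> Endpoint Z: INIT
    Endpoint Z -> Endpoint A: INIT-ACK
    Endpoint A -> Endpoint Z: COOKIE-ECHO *   (Endpoint Z: Association Is Up)
    Endpoint Z -> Endpoint A: COOKIE-ACK *    (Endpoint A: Association Is Up)
    * -- User data can be attached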

SCTP Association Setup
• SCTP uses a four-way handshake to set up an association
• The side doing the active (or implicit) open will formulate and send an INIT chunk
• The sender of the INIT includes various parameters:
    IPv4 and IPv6 address parameters identifying all bound addresses within the peer's scope
    Extensions such as PR-SCTP, Adaption Layer Indication, and possibly a Supported Address list
    There could also be cookie preservatives and other sundry items as well

Sending an INIT
• Two important random values that a sender of an INIT (and an INIT-ACK) generates:
    A Verification Tag (V-Tag) will provide the peer with a nonce that must be present in every packet sent (this is placed in the initiate tag field)
    An Initial TSN provides the starting point for the transport sequence space
• The V-Tag provides modest security for the association and also removes the need for a pseudo-header in the checksum

The INIT is in Flight
[Figure: message sequence between Endpoint A and Endpoint Z]
    Endpoint A -> Endpoint Z: INIT

Receiving an INIT
• The receiver of the INIT will validate that a listener exists for the destination port. If not, it will send an ABORT back to the sender.
• It may do some checking and validation, but in general it will always send back an INIT-ACK saving NO state. This prevents SCTP from being subject to the TCP SYN-like attacks.
• In formulating an INIT-ACK, the responder will include all the various parameters just like what a sender does when formulating an INIT, but with one important addition.

Formulating the INIT-ACK Response
• The receiver of the INIT MUST include a state cookie parameter in the INIT-ACK response.
• The state cookie parameter:
    Is signed (usually with MD5 or SHA-1)
    Contains ALL the state needed to set up the association (usually the entire INIT and some pieces of the INIT-ACK)
    Is implementation specific, but must include a timestamp
• Pages 86-88 of the SCTP reference book go into more detail on state cookie generation

Back Goes the INIT-ACK
[Figure: message sequence between Endpoint A and Endpoint Z]
    Endpoint A -> Endpoint Z: INIT
    Endpoint Z -> Endpoint A: INIT-ACK

When the INIT-ACK Arrives...
• The receiver of the INIT-ACK must take special care in finding the association for the endpoint that sent the INIT.
• In particular it must look at the address list inside the INIT-ACK in case the source address is not the same as where the INIT was sent.
• After finding the association, the receiver will add all of the peer's information (addresses, V-Tag, initial sequence number, etc.) to the local TCB.

More on Processing the INIT-ACK
• At this point the receiver must reply back with a COOKIE-ECHO chunk:
    The cookie is retrieved by simply finding the state-cookie parameter and changing the first two bytes into the chunk type and flags field (set to 0) of the COOKIE-ECHO chunk.
    This chunk is sent back to the source address of the INIT-ACK packet.
    As long as the COOKIE-ECHO chunk is first in the packet, any queued DATA chunks may be bundled in the SCTP packet.
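The "change the first two bytes" trick works because a parameter header (16-bit type, 16-bit length) and a chunk header (8-bit type, 8-bit flags, 16-bit length) are both four bytes with the length in the same position. A minimal sketch, where `cookie_echo_from_param` is a hypothetical helper and the cookie bytes are made up:

```python
import struct

STATE_COOKIE_PARAM = 0x0007
COOKIE_ECHO_CHUNK = 0x0A


def cookie_echo_from_param(param: bytes) -> bytes:
    """Turn a State Cookie parameter into a COOKIE-ECHO chunk."""
    ptype, length = struct.unpack_from("!HH", param)
    if ptype != STATE_COOKIE_PARAM:
        raise ValueError("not a State Cookie parameter")
    # Overwrite the 16-bit parameter type with chunk type + flags (0);
    # the length field, cookie bytes, and padding are reused unchanged.
    return bytes([COOKIE_ECHO_CHUNK, 0x00]) + param[2:]


param = struct.pack("!HH", STATE_COOKIE_PARAM, 9) + b"COOKI" + b"\x00" * 3
chunk = cookie_echo_from_param(param)
```

Because nothing inside the cookie is touched, the signature computed by the INIT-ACK sender stays valid.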
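
Feed the Peer a Cookie
[Figure: message sequence between Endpoint A and Endpoint Z]
    Endpoint A -> Endpoint Z: INIT
    Endpoint Z -> Endpoint A: INIT-ACK
    Endpoint A -> Endpoint Z: COOKIE-ECHO *
    * -- User data can be attached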

A Packet with the COOKIE-ECHO
[Figure: each row is 32 bits wide]
    Source Port | Destination Port
    Verification Tag
    Checksum
    Type=0x0A | Flags=0 | Chunk Length=N
    Cookie Data (N - 4 bytes)
    Type=0x00 | Flags=0x03 | Chunk Length=K
    TSN = X
    Stream Number = M | Stream Sequence = 0
    Payload Protocol ID = A
    User Data (K - 16 bytes)

Processing the Cookie-Echo
• First, validate that the state cookie has not been modified by running the hash over it and the internal secret key. If they do not match, the cookie is silently discarded.
• Next, the timestamp field in the cookie is checked. If it proves to be an old cookie, a stale cookie error is sent to the peer.
• Otherwise, the cookie is used to create a new TCB.
• The association now enters the ESTABLISHED state.

More on Cookie Processing
• Note that this quick summary assumes a normal non-collision, non-restart case. Collision cases are accounted for in the specification.
• After the cookie is processed and the TCB is created, the endpoint then processes any additional chunks contained in the packet.
• Note that the additional chunks are processed in the ESTABLISHED state, since the cookie processing was completed.

Acknowledge the Eaten Cookie
• After the packet with the COOKIE-ECHO is fully processed, a COOKIE-ACK response is sent back.
• At this point, any other chunks (DATA, SACK, etc.) can also be bundled with the COOKIE-ACK.
• One final interesting note: most implementations will include within the state cookie the address to which the INIT-ACK was sent. This is because this address will be the only one that is considered "confirmed" initially.
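
Association Completed
[Figure: message sequence between Endpoint A and Endpoint Z]
    Endpoint A -> Endpoint Z: INIT
    Endpoint Z -> Endpoint A: INIT-ACK
    Endpoint A -> Endpoint Z: COOKIE-ECHO *   (Endpoint Z: Association Is Up)
    Endpoint Z -> Endpoint A: COOKIE-ACK *    (Endpoint A: Association Is Up)
    * -- User data can be attached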

Other Association Setup Issues to Consider
• The SCTP book contains additional material regarding INIT and INIT-ACK chunks.
• A large set of special case handling is described in section 4.7 (pages 103-122) of the SCTP reference book. These cases deal with collisions and restarts.
• We will walk through the restart case (4.7.4) and discuss tie-tags briefly.
• Refer to the SCTP book for details on all of the other cases (it is the only place that such collisions are documented to my knowledge).

Association Restart
• An association restart occurs when a peer crashes and restarts rapidly.
• The restart and the attempt to re-establish the association must occur before the non-restarting peer's HEARTBEAT is sent.
    (HEARTBEATs are discussed later)
• We start our scenario with the following picture:

Restart: Initial Condition
[Figure: both endpoints in the ESTABLISHED state]
    Endpoint-A: VT_L=A, VT_P=Z
    Endpoint-Z: VT_L=Z, VT_P=A

Restart: Initial Condition Description
#
Peers Endpoint-A and Endpoint-Z have their
association in the ESTABLISHED state.
#
VT_L (Verification Tag Local) is the value that the
endpoint expects in the V-Tag of each received
packet.
#
VT_P (Verification Tag Peer) is the value that each
endpoint will send as the V-Tag in every packet.
#
So, if Endpoint-A sends a packet to Endpoint-Z, it
places "Z" in the V-Tag field of the common header.

Restart: The CRASH
[Figure: EndPoint-Z crashes, restarts with VT_L=Zx, and sends INIT (Tag=Zx) to EndPoint-A, which is still ESTABLISHED with VT_L=A, VT_P=Z.]

Restart: The Crash Described
#
Endpoint-Z suddenly crashes and restarts.
#
After the application restarts, it (re-)attempts to
set up an association with Endpoint-A using the
same local SCTP transport addresses.
#
Endpoint-Z chooses a new random tag "Zx" and
sends off a new INIT to its 'potential' peer.
  Remember, Endpoint-Z's SCTP stack is unaware of the
  previous association.

Restart: Hmm... A New Association?
[Figure: EndPoint-A answers the INIT (Tag=Zx) with INIT-ACK (Tag=Ax, Cookie(TT(Ay/Zy))). EndPoint-Z now holds VT_L=Zx, VT_P=Ax; EndPoint-A remains ESTABLISHED with VT_L=A, VT_P=Z.]

Restart: Handling the New INIT
#
Endpoint-A receives the new INIT from its peer out
of the blue.
#
Endpoint-A cannot necessarily trust this INIT since
the V-Tag it expects in every packet is NOT present
(since Endpoint-Z restarted).
#
Endpoint-A will respond with an INIT-ACK with:
  A new random verification tag (Ax)
  Two new random Tie-Tags (Ay and Zy) sent in the state
  cookie (and also stored in the TCB)

Restart: Everything Normal (Sort-of)
[Figure: after INIT (Tag=Zx) and INIT-ACK (Tag=Ax, Cookie(TT(Ay/Zy))), EndPoint-Z returns COOKIE-ECHO (Tag=Ax + Cookie) to EndPoint-A. EndPoint-Z holds VT_L=Zx, VT_P=Ax; EndPoint-A is still ESTABLISHED with VT_L=A, VT_P=Z.]

Restart: Tie-Tags
#
RFC 2960 and the SCTP reference book instruct
that the old V-Tags be used as the Tie-Tags.
#
The most recent I-G has changed this so that V-Tags
are never revealed on the wire except during their
initial exchange. (Tie-Tags are now basically 32-bit
random nonces that represent the TCB.)
  This change in the I-G adds extra security for a
  minimal additional TCB storage cost.
#
The restarting peer considers everything normal
when the INIT-ACK arrives and sends off the
COOKIE-ECHO which holds the Tie-Tags.

Peer Restart
[Figure: EndPoint-A answers the COOKIE-ECHO (Tag=Ax + Cookie) with COOKIE-ACK. EndPoint-A now holds VT_L=Ax*, VT_P=Zx; EndPoint-Z holds VT_L=Zx, VT_P=Ax. * App is given Restart notification.]

Restart: Final Processing
#
Endpoint-A will unpack and verify the state cookie.
As part of validation it will use the Tie-Tags to
determine that a peer restart has occurred.
#
It will reply with a COOKIE-ACK to the restarted peer
(Endpoint-Z).
#
It will also notify its upper layer or application that a
peer restart has occurred.
#
Note that the SCTP stack on Endpoint-Z is never
aware that a restart of the association has occurred.

Questions?

Data Transfer Basics
#
We now shift our attention to normal data transfer.
#
Data transfer happens in the ESTABLISHED,
SHUTDOWN-PENDING, SHUTDOWN-SENT and
SHUTDOWN-RECEIVED states.
#
Note that even though the COOKIE-ECHO and
COOKIE-ACK can optionally bundle DATA, we are in
the ESTABLISHED state by the time the DATA is
processed.

Byte-stream vs. Messages
#
When data is transferred in TCP, the user gets a
stream of bytes (not to be confused with SCTP
streams).
#
Users must "frame" their own messages if they are
not transferring a stream of bytes (ftp might be
considered an application that sends a stream of
bytes).
#
An SCTP user will send and receive messages. All
message boundaries are preserved.
#
A user will always read either ALL of a message or,
in some cases, part of a message.

Receiving and Sending Messages
#
A user will NEVER see two different messages in a
buffer returned from a single recvmsg() call.
#
An SCTP user will pass a message to the sendmsg()
or sctp_sendmsg() function call for sending (more on
these two calls later).
#
The user message will then take one of two paths
through the SCTP stack:
  Fragmentation -or-
  Singleton

SCTP Data Chunk Size
#
In the case of a singleton, the message must fit
entirely in one SCTP chunk.
#
The maximum chunk size SCTP uses is usually
dictated by the smallest MTU of all of the peer's
destination addresses.
#
Recall that PMTU discovery is part of RFC 2960 and
must be implemented.

Adding the Headers
#
A DATA chunk header is prefixed to the user
message.
#
TSN, Stream Identifier, and Stream Sequence
Number (if ordered) are assigned to each DATA
chunk.
#
The DATA chunk is then queued for bundling into an
SCTP packet.
#
An SCTP packet is a common header plus a
collection of chunks (both control and DATA).

An SCTP Packet
[Figure: an SCTP packet laid out as the SCTP Common Header followed by Chunk 1 ... Chunk N.]

What To Do When It Won't All Fit?
#
The process of splitting messages up into multiple
parts is called fragmentation.
#
A message that cannot fit into a single chunk is
chopped up into multiple parts.
#
All parts of the same message use the same Stream
Identifier (SID) and Stream Sequence Number (SSN).
#
Each part will use a unique TSN (in consecutive
order) and appropriate flag bits to indicate if it is a
first, last, or middle piece of a message.
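
The fragmentation rules above can be sketched in Python. This is an illustrative sketch only, not code from any SCTP stack: the DataChunk class, the fragment() helper and the example numbers (a 3800-octet message and a 464-octet payload limit, i.e. a 512-octet PMTU minus an assumed 20-octet IPv4 header, 12-octet common header and 16-octet DATA chunk header) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DataChunk:          # hypothetical container, not a real stack's type
    tsn: int              # Transmission Sequence Number, unique per piece
    sid: int              # Stream Identifier, shared by all pieces
    ssn: int              # Stream Sequence Number, shared by all pieces
    b_bit: bool           # set on the first piece
    e_bit: bool           # set on the last piece
    payload: bytes


def fragment(message, max_payload, first_tsn, sid, ssn):
    """Split one user message into DATA chunks with consecutive TSNs."""
    if not message:
        raise ValueError("a DATA chunk must carry user data")
    pieces = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)]
    last = len(pieces) - 1
    return [DataChunk((first_tsn + i) % 2**32, sid, ssn,
                      i == 0, i == last, piece)
            for i, piece in enumerate(pieces)]


chunks = fragment(bytes(3800), 464, first_tsn=1, sid=0, ssn=0)
print([c.tsn for c in chunks])   # nine consecutive TSNs, 1 through 9
```

A singleton is simply the case where the message fits in one piece, so the single chunk carries both the B and the E bit.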

A Large Message Transfer
[Figure, animated over 13 slides: EndPoint A sends a 3800-octet message to EndPoint Z over a path with PMTU=512 octets. The message crosses the network as nine DATA chunks, TSN 1 through TSN 9, one after another. TSN 1 has the B bit set to 1 and TSN 9 has the E bit set to 1. In the final frame the 3800-octet message is shown at EndPoint Z.]

Data Reception
#
When an SCTP packet arrives, all control chunks are
processed first.
#
DATA chunks have their chunk headers detached and
the user message is made available to the
application.
#
Out-of-order messages within a stream will be held
for stream sequence re-ordering.
#
If a fragmented message is received, it is held until
all pieces of it are received.

More on Data Reception
#
All pieces are received when the receiver has the chunk
with the first (B) bit set, the chunk with the last (E) bit
set, and all intervening TSNs between these two chunks.
#
The data is reassembled into a user message using
the TSN to order the middle pieces from lowest to
highest.
#
After reassembly, the message is made available to
the upper layer (within ordering constraints).
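
The completeness test and TSN ordering described above can be sketched as follows. This is a minimal Python sketch, assuming each received piece is a (tsn, b_bit, e_bit, payload) tuple and that all pieces passed in belong to one message; the reassemble() name and the tuple layout are hypothetical.

```python
def reassemble(pieces):
    """Return the user message, or None while pieces are still missing."""
    by_tsn = {tsn: (b, e, data) for tsn, b, e, data in pieces}
    firsts = [tsn for tsn, (b, _e, _d) in by_tsn.items() if b]
    if not firsts:
        return None                      # B-bit piece not yet received
    tsn, parts = firsts[0], []
    while tsn in by_tsn:                 # walk consecutive TSNs
        _b, e, data = by_tsn[tsn]
        parts.append(data)
        if e:
            return b"".join(parts)       # reached the E-bit piece
        tsn = (tsn + 1) % 2**32          # TSNs are 32-bit and wrap
    return None                          # hole between B and E pieces


# Pieces may arrive in any order; the TSN restores the original order.
print(reassemble([(3, False, True, b"c"),
                  (1, True, False, b"a"),
                  (2, False, False, b"b")]))   # b'abc'
```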

Transmission Rules
#
As in TCP, a congestion window (cwnd) and receive
window (rwnd) are used to control sending of user
data.
#
A sender must not transmit more than the calculated
cwnd on a destination address.
#
The sender also must not attempt to send more than
the peer's rwnd to the peer.

More on Transmission
#
However, if the peer closes its rwnd to 0 and the
sender has no data chunks in flight, it may always
send one packet with data to probe for a change in
the rwnd.

Selective Acknowledgment
#
Data is acknowledged via a delayed SACK scheme
similar to TCP.
#
A SACK chunk includes the cumulative ack point
(cum-ack).
#
cum-ack is the highest sequential TSN that has been
received.
#
Out-of-order segments received are reported with
"gap ack blocks" in the SACK.

More on SACK
#
We always attempt to send a SACK back towards
the destination address where the DATA came from.
#
With the cum-ack point and gap ack blocks, a SACK
chunk fully describes all TSNs received within
PMTU constraints:
  For a 1500 byte ethernet frame, this means that over 360
  gap blocks can be included in addition to the fixed fields of
  a SACK chunk.
#
A SACK may also contain indications of duplicate
data reception.
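
As a rough illustration of how a receiver could derive the cum-ack and gap ack blocks from the TSNs it holds, here is a minimal Python sketch. The function names are hypothetical, TSN wrap-around is ignored for brevity, and the header sizes (20-octet IPv4 header without options, 12-octet SCTP common header, 16 octets of fixed SACK fields, 4 octets per gap ack block) are the assumptions behind the "over 360" figure.

```python
def build_sack(cum_ack, received):
    """Return (new cum-ack, gap ack blocks) for a set of received TSNs.

    Gap ack blocks are (start, end) offsets relative to the cum-ack,
    which is how RFC 2960 encodes them.
    """
    while cum_ack + 1 in received:
        cum_ack += 1                       # highest sequential TSN
    gaps = []
    for tsn in sorted(t for t in received if t > cum_ack):
        offset = tsn - cum_ack
        if gaps and gaps[-1][1] + 1 == offset:
            gaps[-1] = (gaps[-1][0], offset)   # extend the current block
        else:
            gaps.append((offset, offset))      # start a new block
    return cum_ack, gaps


def max_gap_blocks(mtu, ip_header=20, common_header=12, sack_fixed=16):
    """Gap ack blocks (4 octets each) that fit in one packet."""
    return (mtu - ip_header - common_header - sack_fixed) // 4


print(build_sack(10, {11, 12, 14, 15, 17}))   # (12, [(2, 3), (5, 5)])
print(max_gap_blocks(1500))                    # 363
```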

More on SACKs
#
A receiver is allowed to revoke any data previously
acknowledged in gap ack blocks.
  Example: receiver's reassembly buffer is memory limited
#
This means that a sender must hold a TSN until after
the cum-ack has reached it.

Retransmission Timer
#
SCTP maintains a Round Trip Time (RTT) and a
Retransmission Time Out (RTO).
#
Most SCTP implementations will use an integer
approximation for the RTT formula created by Van
Jacobson for TCP, i.e. SCTP and TCP use a similar
formula, but in practice everyone uses the same
exact math for both TCP and SCTP.
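
For reference, the Van Jacobson style calculation can be sketched as below. This is a floating-point Python sketch rather than the integer approximation real stacks use; the class name is hypothetical, and the constants (alpha = 1/8, beta = 1/4, initial RTO 3 s, minimum 1 s, maximum 60 s) are the values recommended in RFC 2960.

```python
class RtoEstimator:
    ALPHA = 1 / 8    # gain applied to the smoothed RTT
    BETA = 1 / 4     # gain applied to the RTT variance

    def __init__(self, rto_initial=3.0, rto_min=1.0, rto_max=60.0):
        self.srtt = None
        self.rttvar = None
        self.rto = rto_initial
        self.rto_min = rto_min
        self.rto_max = rto_max

    def on_measurement(self, rtt):
        """Feed one RTT sample (seconds) taken from a non-retransmitted chunk."""
        if self.srtt is None:                  # first measurement
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + 4 * self.rttvar
        self.rto = min(max(rto, self.rto_min), self.rto_max)

    def on_timeout(self):
        """Back off: double the RTO, bounded by the maximum."""
        self.rto = min(self.rto * 2, self.rto_max)


est = RtoEstimator()
est.on_measurement(2.0)
print(est.rto)   # 6.0
```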

More on Retransmission
#
While sending data, an endpoint tries to measure the
RTT once every round trip.
#
We do NOT measure the RTT of any packet that is
retransmitted (since upon acknowledgment we don't
know which transmission the SACK goes with).
#
Since SCTP is a multi-homed protocol, there is a
small complication in how the T3-rtx timer is
managed.

Even More on Retransmission Timer
#
A general rule of thumb is that for any destination
that has outstanding data (unacknowledged data) a
retransmission timer should be running.
#
When all data that was in-flight to a destination is
acknowledged, the timer should be stopped.
#
A peer revoking acknowledgement may also cause a
sender to restart a T3-rtx timer.
#
When starting the T3 timer, we always use the RTO
value, not the RTT.

Other Retransmissions
#
Like TCP, SCTP uses Fast Retransmit (FR) to
expedite retransmission without always requiring a
T3-rtx timeout.
#
The SCTP sender keeps track of the "holes" that gap
ack blocks report as missing by maintaining a
strike count for those chunks.
#
When the strike count reaches four, the DATA chunk
is retransmitted.
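
The strike counting above can be sketched in Python. This is a simplified illustration, not a real stack's logic: the function name and arguments are hypothetical, and it assumes a chunk is struck whenever a SACK's gap ack blocks cover a TSN higher than the missing one.

```python
def apply_sack_strikes(outstanding, cum_ack, gap_acked, strikes, threshold=4):
    """Update strike counts for one SACK; return TSNs to fast retransmit.

    outstanding: TSNs sent but not yet cumulatively acknowledged
    gap_acked:   TSNs covered by this SACK's gap ack blocks
    strikes:     dict of TSN -> strike count, updated in place
    """
    highest = max(gap_acked, default=cum_ack)
    to_retransmit = []
    for tsn in sorted(outstanding):
        if tsn <= cum_ack or tsn in gap_acked:
            continue                        # acknowledged, not a hole
        if tsn < highest:                   # reported missing by this SACK
            strikes[tsn] = strikes.get(tsn, 0) + 1
            if strikes[tsn] == threshold:
                to_retransmit.append(tsn)
    return to_retransmit


strikes = {}
in_flight = {5, 6, 7, 8, 9}
for acked in ({6}, {6, 7}, {6, 7, 8}, {6, 7, 8, 9}):
    result = apply_sack_strikes(in_flight, 4, acked, strikes)
print(result)   # [5] after the fourth strike
```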

More on Fast Retransmit
#
When a FR occurs, a cwnd adjustment is made, but
not as drastic as a T3-rtx timeout. [more on this later]
#
Only one adjustment is made per flight of data so
that multiple FRs in the same window do NOT cut
the cwnd more than once (note the I-G has more
details on this procedure).
#
This single reduction is sometimes referred to as
"NewReno". NewReno is named after the version of
TCP that it originated in.

What Happens at Timer Expiration
#
A cwnd adjustment is made. [more on this later]
#
The RTO is doubled.
#
All outstanding data to that destination is marked for
retransmission.
#
If the receiver is multi-homed, an alternate address
is selected for the data chunks that were in-flight.
#
Retransmit up to one MTU's worth of data towards
the peer.

Multi-homed Considerations
#
When a peer is multi-homed, a "primary destination
address" will be selected by the SCTP endpoint.
#
By default, all data will be sent to this primary
address.
#
When the primary address fails, the sender will
select an alternate primary address until it is
restored or the user changes the primary address.
#
SACKs may also require some special handling;
consider the following:

A Multi-homed Peer With a Failure
[Figure: EP-1 and EP-2 connected through an IP network; addresses IP-1 through IP-4 are shown, with one path marked as failed.]

Special Considerations
#
If IP-2 was EP-2's primary address, then the
association may still fail even though EP-1 has
multiple addresses. [more on association failures later]
#
In the preceding drawing imagine that EP-1 is
sending packets with source address IP-2.
#
If EP-2 always sends SACKs back to IP-2, EP-1 will
never receive a SACK.
#
To prevent this, a receiver will generally alter the
destination address of a SACK if it receives
duplicate data.

Streams and Ordering
#
A sender tells the sendmsg() or sctp_sendmsg()
function which stream to send data on.
#
Both ordered and un-ordered data can be sent
within a stream.
  For un-ordered data, delivery to the upper layer is
  immediate upon receipt.
  For ordered data, delivery may be delayed due to
  reassembly from network reordering.

More on Streams
#
A stream is uni-directional.
  SCTP makes NO correlation between an inbound and
  outbound stream.
#
An association may have more streams traveling in
one direction than the other.
  Valid stream number ranges for each direction are set
  during association setup.
#
Generally an application will want to tie two streams
together.

Stream Queues
#
Usually, each side of an association maintains a
send queue per stream and a receive queue per
stream for reordering purposes.
#
Stream Sequence Numbers (SSN) are used for
reordering messages in each stream.
#
TSNs are used for retransmitting lost DATA chunks.
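
A per-stream receive queue that reorders by SSN can be sketched as follows. This is a hypothetical Python illustration (the class and method names are not from any real stack); it assumes ordered delivery starting at SSN 0 and relies on the SSN being a 16-bit field that wraps.

```python
class StreamReorderQueue:
    """Holds out-of-order messages of one inbound stream until deliverable."""

    def __init__(self):
        self.next_ssn = 0        # next SSN the application should see
        self.pending = {}        # SSN -> message held for reordering

    def receive(self, ssn, message):
        """Accept one complete message; return the messages now deliverable."""
        self.pending[ssn] = message
        delivered = []
        while self.next_ssn in self.pending:
            delivered.append(self.pending.pop(self.next_ssn))
            self.next_ssn = (self.next_ssn + 1) % 65536   # 16-bit wrap
        return delivered


q = StreamReorderQueue()
print(q.receive(1, "second"))   # [] because SSN 0 has not arrived yet
print(q.receive(0, "first"))    # ['first', 'second']
```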

SCTP Streams
[Figure: each endpoint is drawn with an Sd-queue and an Ro-queue.]

Payload Protocol Identifier
#
Each DATA chunk also includes a Payload Protocol
Identifier (PPID).
#
This PPID is used by the application and network
monitoring equipment to understand the type of
data being transmitted.
#
SCTP pays no attention to this field (it's opaque).

Partial Delivery
#
Normally, a user gets an entire message when it
reads from its socket. The Partial Delivery API
provides an exception to this.
#
The PD-API is invoked when a message is large in
size and the SCTP stack needs to begin delivery of
the message to help free some of the resources held
by it during re-assembly.
#
The pieces are always delivered in order.
#
The API provides a "you have more" indication.

Partial Delivery II
#
The application must continue to read until this
indication clears and assemble the large message.
#
At no time, once the PD-API is invoked, will the
application receive any other message (even if fully
received by SCTP) until the entire PD-API message
has been read.
#
Normally the PD-API is not invoked unless the
message is very large (usually half or more of the
receive buffer).

Error Protection Revisited
#
SCTP was originally defined with the Adler-32
checksum.
#
This checksum was easy to calculate but was shown
to be weak and ineffective for small messages.
#
After MUCH debate the checksum was changed to
CRC32c (the same one used by iSCSI) in RFC 3309.
#
This provides MUCH stronger data integrity than UDP
or TCP but does incur an additional cost in
computation.
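
To give a feel for the computation involved, here is a bit-at-a-time CRC32c in Python. It is a sketch of the CRC-32C function only (reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF); it does not show how SCTP places the result in the common header, and real stacks use table-driven or hardware-assisted versions.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli) over a byte string."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


print(hex(crc32c(b"123456789")))   # 0xe3069283, the standard check value
```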

More Errors
#
If an endpoint receives a packet with a bad
checksum, the packet is silently discarded.
#
Other types of errors may also occur, such as the
sender using a stream number that was not
negotiated up front (i.e. out of range):
  In this case, an ERROR report would be sent back to the
  peer, but the TSN would be acknowledged.
#
If an empty DATA chunk is received (i.e. no user data)
the association will be ABORTED.

Questions?

Congestion Control (CC)
#
We will now go into congestion control (CC).
  For some of you who have worked in transport, this will be
  somewhat repetitive (sorry).
#
CC originally did not exist in TCP. This caused a
series of congestion collapses in the late 80's.
#
Congestion collapse is when the network is passing
lots of data but almost ALL of that data is
retransmissions of data that has already arrived at
the peer.
  RFC 896 provides lots of details for those interested in
  congestion collapse.

Congestion Control II
#
In order to avoid congestion collapse, CC was added
to TCP. An Additive Increase Multiplicative Decrease
(AIMD) function is used to adjust the sending rate.
#
The basic idea is to slowly increase the amount an
endpoint is allowed to send (cwnd), but collapse cwnd
rapidly when there is a sign of congestion.
#
Packet loss is assumed to be the primary indicator
and result of congestion.

Congestion Control Variables
#
Like TCP, SCTP uses AIMD, but there are differences
in how it all works (compared to TCP).
#
SCTP uses four control variables per destination
address:
  cwnd - congestion window, or how much a sender is
  allowed to send towards a specific destination
  ssthresh - slow start threshold, or where we cut over from
  Slow Start to Congestion Avoidance (CA)

Congestion Control Variables II
  flightsize - how much data is unacknowledged and thus
  "in-flight". Note that in RFC 2960 the term flightsize is
  avoided, since it does not really have to be coded as a
  variable (an implementation may re-count flightsize as
  needed).
  pba - partial bytes acknowledged. This is a new control
  variable that helps determine when a cwnd's worth of data
  has been sent and acknowledged while in CA.
#
We will go through the use of these variables in an
example, so don't panic!

Congestion Control: Initialization
#
Initially a new destination address starts with an
initial cwnd of two MTUs. However, the latest I-G
changes this to min[4 MTUs, 4380 bytes].
#
ssthresh is theoretically set to infinity, but it is usually
set to the peer's rwnd.
#
flightsize and pba are set to zero.
#
Slow Start (SS) is used when cwnd <= ssthresh.
  Note that initially we are in Slow Start (SS).

Congestion Control: Sending Data
#
As long as there is room in the cwnd, the sender is
allowed to send additional data into the network.
  There is room in the cwnd as long as flightsize < cwnd.
#
This is slightly different than TCP in that SCTP can
"slop" over the cwnd value. If the flightsize is
(cwnd-1), another packet can be sent.
#
Every time a SACK arrives, one of two algorithms,
Slow Start (SS) or Congestion Avoidance (CA), is
used to increment the cwnd.

Controlling cwnd Growth
#
When a SACK arrives in SS, we increment the cwnd
by either the number of bytes acknowledged or
one MTU, whichever is less.
  Slow Start is used when cwnd <= ssthresh
#
When a SACK arrives in CA, we increment pba by
the number of bytes acknowledged. When pba >
cwnd, increment the cwnd by one MTU and reduce
pba by the cwnd.
  Congestion Avoidance is used when cwnd > ssthresh

Congestion Control
#
pba is reset to zero when all data is acknowledged.
#
We NEVER advance cwnd if the cumulative
acknowledgment point is not moving forward.
#
A Max Burst Limit is always applied to how many
packets may be sent at any opportunity to send.
  This limit is usually 4.
  An opportunity to send is any event that will cause data
  transmission (SACK arrival, user sending of data, etc.)

Congestion Control Example
[Figure: message sequence between EP-Z and EP-A carrying DATA(1452), DATA(1452), DATA(1096), then DATA(1452), DATA(548); points 1 through 4 are marked on the timeline.]

Congestion Control Example II
#
In our example, at point 1 we are at the initial stage:
cwnd=3000, ssthresh=infinity, pba=0, flightsize=0.
Our application sends 4000 bytes.
#
The implementation sends these (note there is no
block by cwnd).
#
At point 2, the SACK arrives and we are in SS. The
cwnd is incremented to 4500 bytes, i.e. add
min(1500, 2904).

Congestion Control Example III
#
At point 3, the SACK arrives for the last data
segment, but no cwnd advance is made. Why?
#
Our application now sends 2000 bytes. These can be
sent since flightsize is 0 and cwnd is 4500.
#
At point 4, no congestion control advancement is
made.
#
So we end with flightsize=0, pba=0, cwnd=4500, and
ssthresh still infinity.
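
The example can be replayed in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the chunk sizes 1452, 1096 and 548 are read from the example figure, and the "window must be fully utilized before cwnd grows" check is taken from RFC 2960 rather than from these slides. That check is a likely answer to the "Why?" at point 3: only 1096 bytes were in flight against a cwnd of 4500. RFC 2960 words the CA test as "equal to or greater than", which is what the sketch uses.

```python
MTU = 1500
INFINITY = float("inf")


def on_sack(cwnd, ssthresh, pba, flight_before, acked):
    """Return the new (cwnd, pba) after a SACK that advances the cum-ack."""
    window_full = flight_before >= cwnd      # full-utilization rule (RFC 2960)
    if cwnd <= ssthresh:                     # Slow Start
        if window_full:
            cwnd += min(acked, MTU)
    else:                                    # Congestion Avoidance
        pba += acked
        if pba >= cwnd and window_full:
            pba -= cwnd
            cwnd += MTU
    return cwnd, pba


def run_example():
    cwnd, ssthresh, pba, flight = 3000, INFINITY, 0, 0
    for size in (1452, 1452, 1096):          # point 1: 4000 bytes sent
        assert flight < cwnd                 # room as long as flightsize < cwnd
        flight += size
    cwnd, pba = on_sack(cwnd, ssthresh, pba, flight, 2904)   # point 2
    flight -= 2904
    cwnd, pba = on_sack(cwnd, ssthresh, pba, flight, 1096)   # point 3
    flight -= 1096
    for size in (1452, 548):                 # 2000 more bytes sent
        assert flight < cwnd
        flight += size
    cwnd, pba = on_sack(cwnd, ssthresh, pba, flight, 2000)   # point 4
    flight -= 2000
    return cwnd, pba, flight


print(run_example())   # (4500, 0, 0)
```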

Reducing cwnd and Adjusting ssthresh
#
The cwnd is lowered on two events, both involving a
retransmission.
#
Upon a T3-rtx timeout, set ssthresh to half the value of
cwnd or 2 MTUs, whichever is more. Then set cwnd to
1 MTU.
#
Upon a Fast Retransmit (FR), again set ssthresh to
half the cwnd or 2 MTUs, whichever is more. Then set
cwnd to the value of ssthresh.
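
The two reductions can be written as a pair of small Python functions. The names are hypothetical, and integer halving is an assumption of the sketch.

```python
def on_t3_rtx_timeout(cwnd, mtu):
    """Return (cwnd, ssthresh) after a T3-rtx timeout."""
    ssthresh = max(cwnd // 2, 2 * mtu)
    return mtu, ssthresh                 # cwnd collapses to one MTU


def on_fast_retransmit(cwnd, mtu):
    """Return (cwnd, ssthresh) after a Fast Retransmit."""
    ssthresh = max(cwnd // 2, 2 * mtu)
    return ssthresh, ssthresh            # cwnd is only cut to ssthresh


print(on_t3_rtx_timeout(12000, 1500))    # (1500, 6000)
print(on_fast_retransmit(12000, 1500))   # (6000, 6000)
```

In both cases the resulting cwnd is no larger than ssthresh, so the sender continues in Slow Start.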

Congestion Control
#
Note this means that if we were in CA, we move
back to SS for either FR or T3-rtx adjustments to
cwnd.
#
So how do we tell if we are in CA or SS?
  Any time the cwnd is larger than the ssthresh we perform
  the CA algorithm. Otherwise we are in SS.

Path MTU Discovery
#
PMTU Discovery is "built" into the SCTP protocol.
#
An SCTP sender always sets the DF bit in IPv4.
#
When a packet with the DF bit set will not "fit", an
ICMP message is returned by the trusty router.
#
This message is used to reset the PMTU and
possibly the smallest MTU.
#
Note that this may also mean that re-chunking occurs
(in some situations).

Questions?

Failure Detection and Recovery
#
SCTP has two methods of detecting faults:
  Heartbeats
  Data retransmission thresholds
#
Two types of faults can be discovered:
  An unreachable address
  An unreachable peer
#
A destination address may be unreachable due to
either a hardware or network failure.

Unreachable Destination Address
[Figure: Endpoint-1 and Endpoint-2, each with interfaces NI-1 and NI-2, joined by two IP networks; one path is marked as failed.]

Unreachable Peer Failure
#
A peer may be unreachable due to either:
  A complete network failure
  Or, more likely, a peer software or machine failure
#
To an SCTP endpoint, both cases appear to be the
same failure event (network failure or machine
failure).
#
In cases of a software failure, if the peer's SCTP stack
is still alive, the association will be shut down either
gracefully or with an ABORT message.

Unreachable Peer: Network Failure
[Figure: Endpoint-1 and Endpoint-2, each with interfaces NI-1 and NI-2, joined by two IP networks; both paths are marked as failed.]

Unreachable Peer: Endpoint Failure
[Figure: Endpoint-1 and Endpoint-2, each with interfaces NI-1 and NI-2, joined by two IP networks.]

Heartbeat Monitoring Mechanism
#
A HEARTBEAT is sent to any destination address
that has been idle for longer than the heartbeat
period.
#
A destination address is idle if no chunks that can
be used for RTT updates have been sent to it,
e.g. usually DATA and HEARTBEAT.
#
The heartbeat period timer is reset any time a DATA
or HEARTBEAT is sent.
#
The peer responds with a HEARTBEAT-ACK.

Unreachable Destination Detection
#
Each time a HEARTBEAT is sent, a Destination Error
count for that destination is incremented.
#
Any time a HEARTBEAT-ACK is received, the Error
count is cleared.
#
Any time DATA is acknowledged that was sent to a
destination, its Error count is cleared.
#
Any time a DATA T3-rtx timeout occurs on a
destination, the Error count is incremented.
#
Any time the Destination Error count exceeds a
threshold (usually 5), the destination is declared
unreachable.

Unreachable Destination II
#
If a primary destination is marked "unreachable", an
alternate is chosen (if available).
#
Heartbeats will continue to be sent to "unreachable"
addresses.
#
If a Heartbeat is ever answered, the Error count is
cleared and the destination is marked "reachable".
  If it was the primary destination and no user intervention
  has occurred, it is restored as the primary destination.

Unreachable Peer I
#
In addition to the Destination Error count, an overall
Association Error count is also maintained.
#
Each time a Destination Error count is incremented,
so is the Association Error count.
#
Each time a Destination Error count is cleared, so is
the Association Error count.
#
If the Association Error count exceeds a threshold
(usually 8), the peer is marked as unreachable and
the association is torn down.
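
The interplay of the two error counts can be sketched as a small Python class. The class, method and parameter names are hypothetical; the thresholds of 5 and 8 are the "usual" values quoted above.

```python
class FailureDetector:
    def __init__(self, addresses, dest_threshold=5, assoc_threshold=8):
        self.dest_threshold = dest_threshold
        self.assoc_threshold = assoc_threshold
        self.dest_errors = {a: 0 for a in addresses}
        self.reachable = {a: True for a in addresses}
        self.assoc_errors = 0
        self.assoc_up = True

    def count_error(self, addr):
        """A HEARTBEAT was sent to, or a T3-rtx timeout occurred on, addr."""
        self.dest_errors[addr] += 1
        self.assoc_errors += 1           # incremented together
        if self.dest_errors[addr] > self.dest_threshold:
            self.reachable[addr] = False
        if self.assoc_errors > self.assoc_threshold:
            self.assoc_up = False        # peer unreachable, tear down

    def clear(self, addr):
        """A HEARTBEAT-ACK arrived from addr, or DATA sent to it was acked."""
        self.dest_errors[addr] = 0
        self.assoc_errors = 0            # cleared together
        self.reachable[addr] = True


fd = FailureDetector(["IP-1", "IP-2"])
for _ in range(6):
    fd.count_error("IP-1")
print(fd.reachable["IP-1"], fd.assoc_up)   # False True
```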

Unreachable Peer II
#
Note that the two control variables are separate and
unrelated (i.e. Destination Error threshold and the
Association Error threshold).
#
It is possible that ALL destinations are unreachable
and yet the Association Error count has not
exceeded its threshold for association tear down.
#
This is what is known as being in the Dormant State.
#
In this state, MOST implementations will at least
continue to send to one address.

Other Uses for Heartbeats
#
Heartbeat is also used to calculate RTT estimates.
#
The standard Van Jacobson SRTT calculation is
done on both DATA RTTs and Heartbeat RTTs.
#
Just after association setup, Heartbeats will occur at
a faster rate to "confirm" addresses.
#
Address Confirmation is a new concept added in
Version 10 of the I-G.

Address Confirmation
#
All addresses added to an association via the INIT or
INIT-ACK address lists that were NOT supplied by
the user or used to exchange the INIT and INIT-ACK
are considered to be suspect.
#
These addresses are marked unconfirmed and
CANNOT be marked as the primary address.
#
A Heartbeat with a 64-bit nonce must be sent and a
Heartbeat-Ack with the proper nonce returned
before an address can leave the unconfirmed state.

Why Address Confirmation
[Figure: Endpoint-1, Endpoint-2 and an "Evil" host attached to IP networks, with addresses IP-A, IP-B, IP-Z and IP-X; an Init(IP-A, IP-B) is shown.]

Heartbeat Controls
#
Heartbeats can be turned on and off.
#
Heartbeats have a default interval of 30 seconds.
This can also be adjusted.
#
The Error thresholds can be adjusted:
  Each Destination's Error threshold
  Overall Association Error threshold
#
Care must be taken in making any adjustments as
false failure detections may occur.

Heartbeat Controls II
#
All heartbeats have a random delta (jitter) added to
them to prevent synchronization.
#
The heartbeat interval will equate to
RTO + HB.Interval + (delta).
#
The random delta is +/- 0.50 of RTO.
#
Unanswered heartbeats cause RTO doubling.
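
The interval formula can be sketched in Python as follows. The function name and the rng parameter are hypothetical; the 30-second default is the HB.Interval default quoted earlier, and a uniform distribution for the jitter is an assumption of the sketch.

```python
import random


def next_heartbeat_delay(rto, hb_interval=30.0, rng=random):
    """Seconds until the next HEARTBEAT: RTO + HB.Interval + delta."""
    delta = rng.uniform(-0.5 * rto, 0.5 * rto)   # jitter of +/- 0.50 of RTO
    return rto + hb_interval + delta


# With RTO = 2 s the delay always falls between 31 and 33 seconds.
print(31.0 <= next_heartbeat_delay(2.0) <= 33.0)   # True
```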

Network Diversity and Multi-homing
#
Multi-homing can assist greatly in preventing single
points of failure.
#
Path diversity is also needed to prevent a single
point of failure.
#
Consider the following two networks with maximum
path diversity and minimal path diversity:
  Both hosts are multi-homed, but which network is more
  desirable?

Maximum Path Diversity
[Figure: Endpoint-1 and Endpoint-2 connected with maximum path diversity.]

Minimum Path Diversity
[Figure: Endpoint-1 and Endpoint-2 connected with minimum path diversity.]

Asymmetric Multi-homing
#
In some cases, one side will be multi-homed while
the other side is singly-homed.
#
In this configuration, a single failure on the
multi-homed side may still disable the association.
#
This failure may occur even when an alternate route
exists.
#
Consider the following picture:

Asymmetric Multi-Homing
[Figure: Endpoint-1 (multi-homed) and Endpoint-2 (singly-homed), with
address labels 1.1, 1.2, 2.1, 2.2, 3.1, and 3.2.]
  E-1 Route Table: 3.0 -> 1.2
  E-2 Route Table: 1.0 -> 3.2, 2.0 -> 3.2

Solutions to the Problem
#
One possible solution is shown in the next slide.
#
One disadvantage is that an extra route must be
added to the network, thus using additional address
space.
#
Routing setup is more complicated (most hosts like
to use simple default routes).

Solution 1
[Figure: Endpoint-1 and Endpoint-2, with address labels 1.1, 1.2, 2.1,
2.2, 3.1/4.1, and 3.2.]
  E-1 Route Table: 3.0 -> 1.2, 4.0 -> 2.2
  E-2 Route Table: 1.0 -> 3.2, 2.0 -> 3.2

A Simpler Solution
#
A simpler solution can be made with the assistance of
the multi-homed host's routing table.
#
It first must be set up to allow duplicate routes at any
level in its routing table.
#
Support must be added to query the routing table for
an "alternate" route.
#
When SCTP hits a set error threshold, it asks for an
"alternate" route other than the previously cached one.

Solution 2
[Figure: Endpoint-1 and Endpoint-2, with address labels 1.1, 1.2, 2.1,
2.2, 3.1, and 3.2.]
  E-1 Route Table: Default -> 1.2, Default -> 2.2
  E-2 Route Table: 1.0 -> 3.2, 2.0 -> 3.2

Auxiliary Packet Handling
#
Sometimes, unexpected or "Out of the Blue" (OOTB)
packets are received.
#
In general, an OOTB packet has NO SCTP endpoint
to communicate with (note these rules are only for
SCTP protocol packets).
#
When an OOTB packet is received, a specific set of
rules must be followed.

Auxiliary Packet Handling II
#
1) If the address is non-unicast, the packet is silently
discarded.
#
2) If the packet holds an ABORT chunk, the packet is
silently discarded.
#
3) If the OOTB is an INIT or COOKIE-ECHO, follow
the setup procedures.
#
4) If it is a SHUTDOWN-ACK, send a SHUTDOWN-COMPLETE
with the T bit set [more details in next section].

Auxiliary Packet Handling III
#
If the OOTB is a SHUTDOWN-COMPLETE, silently
discard the packet.
#
If the OOTB is a COOKIE-ACK or ERROR, the packet
should be silently discarded.
#
For all other cases, send back an ABORT with the T
bit set.
  When the T bit is set, it indicates no TCB, and the V-Tag is
    copied from the incoming packet to the outbound ABORT.
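
The OOTB rules on the last two slides amount to a small decision table. A minimal sketch in C follows; the enum and function names are local to this example and do not come from any implementation.

```c
#include <stdbool.h>

/* Chunk kinds that matter for OOTB handling (names local to this sketch). */
enum chunk_kind {
    CK_ABORT, CK_INIT, CK_COOKIE_ECHO, CK_SHUTDOWN_ACK,
    CK_SHUTDOWN_COMPLETE, CK_COOKIE_ACK, CK_ERROR, CK_OTHER
};

enum ootb_action {
    OOTB_DISCARD,                  /* silently drop the packet */
    OOTB_SETUP,                    /* follow the normal setup procedures */
    OOTB_SEND_SHUTDOWN_COMPLETE_T, /* reply with SHUTDOWN-COMPLETE, T bit set */
    OOTB_SEND_ABORT_T              /* reply with ABORT, T bit set */
};

/* Decide what to do with a packet that matches no existing association. */
static enum ootb_action ootb_decide(bool addr_is_unicast, enum chunk_kind kind)
{
    if (!addr_is_unicast)
        return OOTB_DISCARD;                      /* rule 1 */
    switch (kind) {
    case CK_ABORT:
        return OOTB_DISCARD;                      /* rule 2 */
    case CK_INIT:
    case CK_COOKIE_ECHO:
        return OOTB_SETUP;                        /* rule 3 */
    case CK_SHUTDOWN_ACK:
        return OOTB_SEND_SHUTDOWN_COMPLETE_T;     /* rule 4 */
    case CK_SHUTDOWN_COMPLETE:
    case CK_COOKIE_ACK:
    case CK_ERROR:
        return OOTB_DISCARD;
    default:
        return OOTB_SEND_ABORT_T;                 /* all other cases */
    }
}
```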

Verification Tag Rules
#
All packets hold a V-Tag in the common header.
  The V-Tag is a 32 bit nonce that each side picks during
    association setup (in the INIT and INIT-ACK chunks).
#
All packets received have the checksum calculated
and the V-Tag verified.
#
There is a set of rules for handling V-Tags, just like
there are for OOTB.

Two Basic V-Tag Rules
#
When the packet does NOT contain an ABORT, INIT,
SHUTDOWN-COMPLETE, or COOKIE-ECHO, two
basic rules apply for V-Tags:
  Rule 1: When sending packets to a peer, the V-Tag is set to
    the Initiate Tag the peer specified in the INIT or INIT-ACK.
  Rule 2: When receiving an SCTP packet from a peer, the
    receiving endpoint must validate that the V-Tag matches
    the Initiate Tag it used in the INIT or INIT-ACK it sent.

V-Tag Rules: INIT
#
For INIT packets, the following rules apply:
  Rule 3: The sender of an INIT must set the V-Tag of the
    packet to zero.
  Rule 4: If the received packet has a V-Tag set to zero, the
    receiver must check for an INIT.
  If an INIT is present, the standard setup rules for SCTP
    are followed.
  Otherwise, an ABORT is sent with the T bit set.

V-Tag Rules: ABORT
#
For packets carrying an ABORT, Rules 5 - 7 apply:
  Rule 5: When sending an ABORT, the sender should try to
    populate the proper V-Tag in the common header, if known.
  Rule 6: If the V-Tag of the peer is not available, the sender
    will set the T bit and use (copy) the V-Tag from the received
    packet that is causing the ABORT.
  Rule 7: When an ABORT chunk is present in a packet, it
    must be accepted if the V-Tag matches the expected value
    OR the T bit is set and the V-Tag matches the peer's V-Tag
    (i.e. the V-Tag used for outbound packets).

V-Tag Rules: Shutdown Chunks
#
For packets carrying a SHUTDOWN-COMPLETE:
  Rule 8: If a SHUTDOWN-ACK is received for an unknown
    association, send a SHUTDOWN-COMPLETE with the T bit
    set and use the V-Tag from the SHUTDOWN-ACK.
#
When a SHUTDOWN-COMPLETE is received:
  If the T bit is set, compare the received V-Tag with the peer's
    V-Tag to validate the SHUTDOWN-COMPLETE.
  Otherwise, compare with your V-Tag to validate the
    SHUTDOWN-COMPLETE.
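
The receive-side checks from Rules 2 and 7 and the SHUTDOWN-COMPLETE rule can be written as one function. This is a sketch under the assumption that the association has already been looked up; the type and function names are invented for the example. INIT and COOKIE-ECHO are left out because they follow their own rules.

```c
#include <stdbool.h>
#include <stdint.h>

enum vt_chunk { VT_ABORT, VT_SHUTDOWN_COMPLETE, VT_ORDINARY };

/* The two tags an endpoint keeps for an association. */
struct assoc_tags {
    uint32_t my_tag;    /* Initiate Tag we sent; expected on inbound packets */
    uint32_t peer_tag;  /* Initiate Tag the peer sent; used on outbound packets */
};

/* Returns true if a received packet's V-Tag is acceptable. */
static bool vtag_ok(const struct assoc_tags *t, enum vt_chunk kind,
                    bool t_bit, uint32_t pkt_vtag)
{
    switch (kind) {
    case VT_ABORT:
        /* Rule 7: expected tag, OR T bit set and the reflected peer tag. */
        return pkt_vtag == t->my_tag ||
               (t_bit && pkt_vtag == t->peer_tag);
    case VT_SHUTDOWN_COMPLETE:
        /* T bit set: the sender had no TCB and copied our outbound tag. */
        return t_bit ? pkt_vtag == t->peer_tag
                     : pkt_vtag == t->my_tag;
    default:
        return pkt_vtag == t->my_tag;   /* Rule 2 */
    }
}
```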

V-Tag Rules: COOKIE-ECHO
#
Packets carrying a COOKIE-ECHO have special
handling, since the receiver generally has NO TCB:
  Rule 9: When sending a COOKIE-ECHO, the V-Tag used will
    be the Initiate Tag inside the INIT-ACK.
  Rule 10: Before comparing V-Tags, the rules for handling
    state cookies must be executed first. Then, the V-Tag may
    be verified.
  Some implementations do not bother to check the V-Tag,
    since the state cookie's MAC offers much stronger
    protection than the V-Tag.

Break
#
Questions

How Do We Close an Association
#
Like TCP, SCTP uses a 3-way handshake when
closing an association.
#
Unlike TCP, SCTP does NOT support a half-closed
state.
#
This means that once either endpoint closes an
association, both sides are forced to close the
association, sending appropriate notifications to the
upper layer.

The Shutdown Handshake
[Figure: Endpoint-A calls close(); Endpoint-A and Endpoint-Z then exchange
SHUTDOWN, SHUTDOWN-ACK, and SHUTDOWN-COMPLETE. Points 1, 2, and 5 are
marked at Endpoint-A and points 3, 4, and 6 at Endpoint-Z.]

User Closes
#
There are six significant points in our shutdown
handshake scenario.
#
Point 1: the application issues a close() at Endpoint-A.
At this point, NO NEW data can be sent from
Endpoint-A.
#
Point 2: the SCTP implementation at Endpoint-A has
sent and received acknowledgement for all queued
data (before the close). At this point, the endpoint
sends a SHUTDOWN chunk.

User Closes II
#
Point 3: Endpoint-Z receives the SHUTDOWN, so the
upper layer is notified and NO NEW data will be
accepted for transmission.
#
Point 4: Endpoint-Z has received acknowledgment
for all its queued data, so it sends a SHUTDOWN-ACK.
#
Point 5: Endpoint-A destroys the association/TCB
and sends back a SHUTDOWN-COMPLETE.
#
Point 6: Endpoint-Z receives the SHUTDOWN-COMPLETE
and destroys its association/TCB.

User Closes III
#
Note that Points 1 and 2 may or may not be at the
same moment in time, depending on how much data
is enqueued.
#
Note the same also holds true for Points 3 and 4.

The Shutdown: A Closer Detailed Look
#
Let's look from the perspective of the endpoint that
initiates the shutdown sequence.
#
An application/upper layer initiates the shutdown
sequence by either:
  closing the socket
  making an API call which invokes the 'shutdown' request
#
This puts the endpoint into one of two states:
SHUTDOWN-PENDING or SHUTDOWN-SENT

Shutdown Details II
#
Assume there is data enqueued on both sides for this
discussion, and Endpoint-A initiates the shutdown.
#
The local Endpoint-A then enters the SHUTDOWN-PENDING
state.
  While in this state and through to completion of the
    shutdown, the local endpoint will reject any attempt to send
    new data from the upper layer.

Shutdown Details III
#
Endpoint-A continues with normal data transfer,
sending all queued data to the peer endpoint.
  Note the peer (Endpoint-Z) has no idea that Endpoint-A is in
    the SHUTDOWN-PENDING state.
#
Once all data has been acknowledged, Endpoint-A:
  Starts a Shutdown Timer and a Shutdown-Guard Timer
  Sends a SHUTDOWN to the peer
  Enters the SHUTDOWN-SENT state

Shutdown Details IV
#
The peer, upon receiving the SHUTDOWN, enters the
SHUTDOWN-RECEIVED state and informs its upper
layer. From here on, no new data will be accepted by
the remote endpoint from its upper layer.
#
What happens if the SHUTDOWN gets lost? The
Shutdown Timer will expire, causing a resend of the
SHUTDOWN chunk.
#
Any received DATA will cause the Shutdown Timer
to restart (not the Shutdown-Guard Timer).

Shutdown Details V
#
In fact, every time a DATA chunk arrives, Endpoint-A
will answer with at minimum a SHUTDOWN, and
possibly a SHUTDOWN bundled with a SACK.
#
Note that the delayed SACK algorithm is disabled
during the SHUTDOWN-SENT state.
#
Eventually Endpoint-Z will dequeue all of its data in
the SHUTDOWN-RECEIVED state.
#
At that point it will send a SHUTDOWN-ACK and
start a local Shutdown timer.

Shutdown Details VI
#
Endpoint-Z will resend the SHUTDOWN-ACK until it
receives a SHUTDOWN-COMPLETE.
#
In both cases of Shutdown timer expiration, for
Endpoint-A or Endpoint-Z, the error thresholds are
also incremented, so there is a limit to the number of
SHUTDOWNs and SHUTDOWN-ACKs that will be
sent.
#
Once Endpoint-A receives the SHUTDOWN-ACK, it
will stop its two timers and send back a
SHUTDOWN-COMPLETE.

Shutdown Details VII
#
After sending the SHUTDOWN-COMPLETE, it will
destroy the local TCB.
#
So what does the second shutdown timer do? This
timer is known as the shutdown guard timer (it's not
in RFC 2960). What it does is provide an overall
guard in case the peer is malicious and does not
stop sending new data. If it expires, the TCB is
immediately destroyed. Note this timer is usually set
to at least 60 seconds.
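
The shutdown sequence walked through above can be condensed into a transition function. This is a simplified sketch: the state and event names are local to the example (SHUTDOWN-ACK-SENT is the RFC 2960 name for the state Endpoint-Z is in after sending its SHUTDOWN-ACK), and timers, retransmission, and error counting are reduced to comments.

```c
enum sd_state {
    ST_ESTABLISHED, ST_SHUTDOWN_PENDING, ST_SHUTDOWN_SENT,
    ST_SHUTDOWN_RECEIVED, ST_SHUTDOWN_ACK_SENT, ST_CLOSED
};

enum sd_event {
    EV_APP_CLOSE,             /* upper layer closes or requests shutdown */
    EV_ALL_DATA_ACKED,        /* every queued DATA chunk is acknowledged */
    EV_RCV_SHUTDOWN,
    EV_RCV_SHUTDOWN_ACK,
    EV_RCV_SHUTDOWN_COMPLETE,
    EV_GUARD_TIMER_EXPIRED
};

/* Returns the next state; events that do not apply leave the state as is. */
static enum sd_state sd_next(enum sd_state s, enum sd_event e)
{
    switch (s) {
    case ST_ESTABLISHED:
        if (e == EV_APP_CLOSE)    return ST_SHUTDOWN_PENDING;
        if (e == EV_RCV_SHUTDOWN) return ST_SHUTDOWN_RECEIVED;
        break;
    case ST_SHUTDOWN_PENDING:
        /* send SHUTDOWN, start the Shutdown and Shutdown-Guard timers */
        if (e == EV_ALL_DATA_ACKED) return ST_SHUTDOWN_SENT;
        break;
    case ST_SHUTDOWN_SENT:
        /* send SHUTDOWN-COMPLETE, stop both timers, destroy the TCB */
        if (e == EV_RCV_SHUTDOWN_ACK)    return ST_CLOSED;
        /* peer never stopped sending: destroy the TCB immediately */
        if (e == EV_GUARD_TIMER_EXPIRED) return ST_CLOSED;
        break;
    case ST_SHUTDOWN_RECEIVED:
        /* send SHUTDOWN-ACK, start the local Shutdown timer */
        if (e == EV_ALL_DATA_ACKED) return ST_SHUTDOWN_ACK_SENT;
        break;
    case ST_SHUTDOWN_ACK_SENT:
        if (e == EV_RCV_SHUTDOWN_COMPLETE) return ST_CLOSED;
        break;
    case ST_CLOSED:
        break;
    }
    return s;
}
```

Endpoint-A walks ESTABLISHED, SHUTDOWN-PENDING, SHUTDOWN-SENT, CLOSED; Endpoint-Z walks ESTABLISHED, SHUTDOWN-RECEIVED, SHUTDOWN-ACK-SENT, CLOSED.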

Shutdown Details VIII
#
So you may have noticed an issue: what happens if
Endpoint-A gets the SHUTDOWN-ACK but the
SHUTDOWN-COMPLETE is lost?
#
This is where the special rules discussed previously
come in.
#
Endpoint-A would then receive a resend of the
SHUTDOWN-ACK, but it has no TCB.
#
So instead it sends a SHUTDOWN-COMPLETE with
the T bit set.

Shutdown Details IX
#
This allows Endpoint-Z to recover from lost
SHUTDOWN-COMPLETEs.
#
One other isolated set of events is also handled by
special rules. If, after destroying the TCB, Endpoint-A
sends an INIT at the same time the SHUTDOWN-COMPLETE
is lost, what happens?
#
The normal rules of sending a T bit SHUTDOWN-COMPLETE
still apply, but Endpoint-Z can also send
an ERROR message.

ABORT Chunk
#
One other state cleanup mechanism is the ABORT
chunk.
#
When an application issues an abortive close, the TCB
is destroyed immediately. In this case an ABORT is
sent.
#
The ABORT, in cases of application controlled abort,
contains the proper V-Tag and would cause an
immediate destruction of the peer's TCB upon
receipt.
#
The ABORT chunk is not reliable, however.

ABORT Chunk II
#
If an ABORT is lost, the next packet sent to the
endpoint that destroyed its TCB will be treated as
OOTB.
#
The response would then be an ABORT with the T bit
set. The V-Tag would be that of the incoming packet.
#
In cases of system restart you would also receive an
ABORT with the T bit set in response to any message
(such as a Heartbeat).

Extensions to SCTP
#
SCTP, as we have seen, is very extensible.
#
To extend SCTP, both new chunk types and
parameter types can be added through new RFCs.
#
SCTP implementations use the upper bits to
determine how to handle unknown chunks and
parameters.
#
When designing extensions, one should take this
upper bit handling into account.
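
For chunks, the "upper bits" are the two highest-order bits of the one-byte chunk type, whose four meanings are defined in RFC 2960. A small sketch of that decoding (the enum and function names are invented for the example):

```c
#include <stdint.h>

/* What a receiver does with a chunk type it does not recognize. */
enum unknown_chunk_action {
    UC_STOP_DISCARD        = 0, /* 00: stop processing, discard the packet */
    UC_STOP_DISCARD_REPORT = 1, /* 01: same, and report the chunk in an ERROR */
    UC_SKIP                = 2, /* 10: skip this chunk, keep processing */
    UC_SKIP_REPORT         = 3  /* 11: skip this chunk, report it in an ERROR */
};

/* The action is encoded in the top two bits of the chunk type. */
static enum unknown_chunk_action classify_unknown_chunk(uint8_t chunk_type)
{
    return (enum unknown_chunk_action)(chunk_type >> 6);
}
```

An extension designer picks the chunk type value so that peers without the extension react sensibly. For instance, a chunk type of 192 (0xC0) tells such a peer to skip the chunk and report it.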

Extensions II
#
IANA assigns chunk and parameter values when a
NEW RFC goes through the IETF standards process.
#
Usually, the Internet Draft will contain a "suggested"
parameter or chunk value, taking into account
current existing extension documents.
#
PR-SCTP has just advanced as the first extension to
RFC status (RFC 3758); others will follow the slow
standards process, I am sure :-D

PR-SCTP I
#
Partial Reliability SCTP allows a sender to "skip"
unacknowledged messages.
#
Both endpoints must support the extension. A
parameter is passed during setup to show that
support is present on each side of the association.
#
Normally, an application will put a "time limit" on the
life of any given message.
#
When this time limit expires and the message has
not been acknowledged, a "skip message" is sent.

PR-SCTP II
#
This "skip message" is called a FORWARD-TSN
(FWD-TSN) chunk.
#
The FWD-TSN specifies the new cumulative TSN
point for the remote end.
#
It also specifies any streams and sequence numbers
that are being skipped.
#
The stream information aids a receiving endpoint in
finding held messages for reordering on stream
queues.

PR-SCTP III
#
When a FWD-TSN is received, the receiver must
update its cumulative ack point and respond with a
SACK.
#
The FWD-TSN mechanism is separated in the PR-SCTP
document from the decision process for
skipping a TSN.
#
The document details an extension of the lifetime
mechanism, but other API interfaces are possible.
#
A receiver does not need to be aware of the sender
side policy for skipping TSNs.
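
On the receive side, the core of FWD-TSN handling is moving the cumulative ack point forward, never backward, in a 32-bit TSN space that wraps. A minimal sketch follows; the function names are made up, and the stream/sequence cleanup and the SACK itself are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Serial-number comparison: true if a is "after" b in 32-bit TSN space. */
static bool tsn_gt(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(a - b) < 0x80000000u;
}

/*
 * Apply the new cumulative TSN carried in a FWD-TSN.
 * Returns true if the cumulative ack point moved forward.
 */
static bool apply_forward_tsn(uint32_t *cum_ack, uint32_t new_cum_tsn)
{
    if (!tsn_gt(new_cum_tsn, *cum_ack))
        return false;            /* stale or duplicate: nothing to skip */
    *cum_ack = new_cum_tsn;
    return true;
}
```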

Other Extensions
#
Two other extensions are under development as
well.
#
The ADD-IP draft allows dynamic changes to the
address set of an endpoint without restart of the
association.
#
The AUTH draft allows selected chunks to be
"wrapped" with a signature. The draft is in
flux right now, but its final form will be an
implementation of the PBK-Draft (PBK stands for
Purpose Built Keys).

Break
#
Questions?

SCTP and TCP: Similarities
#
Both use a handshake to set up and terminate the
state (communication) relationship between peers.
#
Both have an abortive method to terminate the state.
#
Both provide a "reliable ordered" service:
  Lost data is retransmitted
  Data is (or can be) delivered in the order it was sent
#
Both follow an AIMD-based congestion control
mechanism.

SCTP and TCP
#
SCTP uses a four-way handshake to set up an
association. TCP uses a three-way handshake to
set up a connection.
#
However, this does not mean that data can start to
be sent more quickly (relative to the start of the
connection) with TCP.
#
SCTP can exchange data on the third and fourth leg
of its handshake. TCP in practice does not (due
to socket API issues).

SCTP and TCP
#
SCTP delivers messages, not a "byte stream".
  An application using TCP must "frame" its own messages
#
SCTP streams allow "partially ordered" transfers.
  Escapes head of line blocking, while preserving order
    within each stream
#
An SCTP sender can send all messages in a single
ordered stream to achieve the same behavior as
TCP.

SCTP and TCP
#
SCTP also provides a "reliable un-ordered" service
for applications.

SCTP and TCP
#
TCP is a singly-homed protocol, so a single interface
failure can shut down a connection. SCTP is
multi-homed and can take advantage of all interfaces
and addresses on a host.
#
SACK support:
  Optional in TCP, fundamental to SCTP
  TCP SACK has a very limited segment space for specifying
    out of order segments
  SCTP has a much larger "gap ack" space so that many sets
    of segments can be reported

SCTP and TCP
#
SCTP does not allow a half-closed state.
  Half-closed state is when one side is no longer allowed to
    send data but the other side can.
#
SCTP does NOT have a timed-wait state that will
hold a connection from being made again within a
specified time.

SCTP and TCP: Security Considerations
#
SCTP uses the four-way handshake and the signed
state cookie to protect against SYN flooding attacks.
#
SCTP uses a 32-bit random nonce to protect its
packets from blind attackers.
  I-G version 10 prevents these from ever being revealed
    after association setup.
  TCP does not have this and is more subject to various
    forms of blind data and control segment injection attacks,
    as we have recently seen in the news.

Using Streams
#
Streams are a powerful mechanism that allows
multiple ordered flows of messages within a single
association.
#
Messages are sent in their respective streams, and if
a message in one stream is lost, it will not hold up
delivery of a message in the other streams.
#
The application specifies the stream number to send
a message on using its API interface.
  For sockets, this is generally sctp_sendmsg()
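
To see why a loss on one stream does not hold up the others, consider a receiver that tracks the next expected sequence number per stream. The sketch below is deliberately minimal: the names and the fixed stream count are assumptions, and a real stack would queue held messages rather than just refuse them.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_STREAMS 4   /* arbitrary size for the sketch */

/* Next stream sequence number expected on each inbound stream. */
struct rx_streams {
    uint16_t next_ssn[NUM_STREAMS];
};

/*
 * Returns true if the message can be handed to the application now,
 * false if it must be held until earlier messages on the SAME stream arrive.
 */
static bool deliver_ordered(struct rx_streams *rx, uint16_t stream, uint16_t ssn)
{
    if (stream >= NUM_STREAMS || ssn != rx->next_ssn[stream])
        return false;
    rx->next_ssn[stream]++;
    return true;
}
```

If message 0 on stream 1 is lost, message 1 on stream 1 is held, but messages on stream 2 are still delivered as they arrive.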

More on Streams
#
An example of using streams can be found in SS7
over IP (sigtran). Here various call messages will be
routed to different streams so that a lost message
on one call will not hold up another call. Usually the
SLS index of SS7 is mapped onto a stream (SLS
values range from 0 to 15 if I remember right :-D).
#
A web client/server could use streams to display
pictures in parallel instead of building multiple
connections.

A Stream Example
[Figure sequence over four slides: an SG attached to the SS7 network
exchanges messages with an MGC. Labels shown on each slide:
  1. IAM SLS=2
  2. IAM SLS=8, IAM SLS=7, IAM SLS=2, SID=2
  3. IAM SLS=2, IAM SLS=8, SID=8, IAM SLS=7, SID=7
  4. IAM SLS=7, IAM SLS=8, ACM SLS=2, SID=2]

Sockets API
#
Chapter 11 of the SCTP book discusses the socket
API. This text is quite dated, but gives the reader a
general idea of how the socket API works.
#
However, a better reference for SCTP socket API
programming is the third edition of Stevens' Unix
Network Programming.
#
This new book has three comprehensive up-to-date
chapters that detail the finer points of working with
the SCTP socket API.

SCTP Socket Types
#
The SCTP socket API comes in two forms: one-to-one
and one-to-many.
#
The one-to-many style was at one time known as the
"UDP style" socket. The one-to-one style used to be
called the "TCP style" socket.
#
So what is the purpose of each socket style and how
can it be used?
#
We will start with the one-to-one style.

One-to-One style
#
The purpose of the one-to-one style socket is to
provide a smooth transition mechanism for those
applications running on TCP and wishing to move to
SCTP.
#
The same semantics used in TCP are used with this
style.
#
A server will typically open the socket, make a call to
listen (to accept associations), and call accept,
blocking upon the arrival of a new association.

One-to-One style
#
The only notable difference between a TCP socket
and an SCTP socket is that the socket call uses
IPPROTO_SCTP instead of IPPROTO_TCP (or 0).
#
Two other common socket options that might be
used in a TCP application have SCTP equivalents:
  TCP_NODELAY -> SCTP_NODELAY
  TCP_MAXSEG -> SCTP_MAXSEG
#
SCTP has a host of other socket options as well,
which we will touch on further on.

One-to-One Style
#
Switching from TCP to SCTP becomes easy with this
style of socket due to the small number of changes
that have to be made.
#
To give you an idea of this, note that I ported a
version of Mozilla with only two lines of change.
#
Of course a quick change like I did in Mozilla did not
gain usage of SCTP streams, but it does gain you the
multi-homing aspects of SCTP.
#
Other caveats of moving a TCP application are that
there is NO half-close state, so if an application
makes use of this, that code will need to be
re-written.

One-to-One Style
#
Another thing that MAY be an issue is that some TCP
applications will write a 2 or 4 byte record length
followed by that many bytes of data. If an application
behaves in this way, SCTP will make each write a
single message.
#
These two messages would most likely be bundled
together, but overall this increases the overhead on
the wire.
#
So what does a typical application using the
one-to-one style socket look like?

One-to-One Example Server
/* do_child_stuff() stands for the application's per-association handler. */
int sd, newfd;
socklen_t sosz;
struct sockaddr_in6 sin6;

sd = socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP);
/* bind(sd, ...) to the server's address and port goes here */
listen(sd, 1);
while (1) {
    sosz = sizeof(sin6);    /* value-result argument: reset before each accept() */
    newfd = accept(sd, (struct sockaddr *)&sin6, &sosz);
    do_child_stuff(newfd, &sin6, sosz);
}

One-to-Many style
#
A typical server using a one-to-many style socket
will do a socket() call, followed by a listen() and
recvfrom().
#
A typical client will just sendto() the server of its
choice.
#
Note that the connect() and accept() calls are not
needed.
#
The connect() call can be done by either side (server
or client), but it is not needed.

One-to-Many style
#
Note that this style is more like what a UDP
client/server would look like, thus the previous
name.
#
So what does a typical one-to-many style server
look like?

One-to-many Example Server
/* do_child_stuff() stands for the application's per-message handler. */
int sd, msg_flags;
ssize_t len;
socklen_t sosz;
struct sockaddr_in6 sin6;
struct sctp_sndrcvinfo snd_rcv;
char buf[8000];

sd = socket(AF_INET6, SOCK_SEQPACKET, IPPROTO_SCTP);
/* bind(sd, ...) to the server's address and port goes here */
listen(sd, 1);
while (1) {
    sosz = sizeof(sin6);    /* value-result argument: reset before each call */
    msg_flags = 0;
    len = sctp_recvmsg(sd, buf, sizeof(buf), (struct sockaddr *)&sin6, &sosz,
                       &snd_rcv, &msg_flags);
    do_child_stuff(sd, buf, len, &sin6, &snd_rcv, msg_flags);
}

One-to-many Description
#
Note in the previous example we introduced the first
of several new/extra calls, sctp_recvmsg().
#
This call is usually built as a library call (it's not a
true system call in most cases).
#
It provides a convenience function that makes it easy
to find out specific information about stream id and
other auxiliary information that SCTP can provide
upon receiving messages.
#
But before we get into the details of all the extra
calls, we need to discuss notifications.

SCTP Notifications
#
The SCTP stack, at times, has information it may
wish to share with its application (or Upper Layer
Protocol, ULP).
#
The ULP can turn specific notifications off and on
via a socket options call.
#
By default ALL notifications are off.
#
So how does one get a notification?
#
By reading data and looking at the msg_flags: if the
message read is a notification, then
"MSG_NOTIFICATION" is contained within the
msg_flags argument upon return.

More on Notifications
#
If the user does NOT use the sctp_recvmsg() call,
then you can also gain access to this flag using the
recvmsg() system call and looking at the msg.msg_flags
field (most library calls implementing sctp_recvmsg()
use recvmsg() and copy the msg.msg_flags into the
int * passed to sctp_recvmsg()).
#
So what do you get when you read a notification?
#
A union is read in that looks as follows:

Notification Union
/* notification event */
union sctp_notification {
    struct sctp_tlv sn_header;
    struct sctp_assoc_change sn_assoc_change;
    struct sctp_paddr_change sn_paddr_change;
    struct sctp_remote_error sn_remote_error;
    struct sctp_send_failed sn_send_failed;
    struct sctp_shutdown_event sn_shutdown_event;
    struct sctp_adaption_event sn_adaption_event;
    struct sctp_pdapi_event sn_pdapi_event;
};

Deciphering Notifications
#
Every Notification uses a TLV format as illustrated
below:
#
So what type of notifications do you get?
struct sctp_tlv {
    u_int16_t sn_type;
    u_int16_t sn_flags;
    u_int32_t sn_length;
};
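
Because every notification starts with the same TLV header, an application can read the type first and only then decide which member of the union to look at. The sketch below uses a local mirror of the header so that it compiles without an SCTP header file; the function name is hypothetical, and a real program would use its platform's definitions and compare the type against constants such as SCTP_ASSOC_CHANGE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the TLV header; a real program would use the
 * definitions from its platform's SCTP header instead. */
struct tlv_hdr {
    uint16_t sn_type;
    uint16_t sn_flags;
    uint32_t sn_length;
};

/*
 * Extract the notification type from a buffer that was read with
 * MSG_NOTIFICATION set. Returns false if the buffer is too short to
 * hold the header or the length the header claims.
 */
static bool notification_type(const void *buf, size_t len, uint16_t *type_out)
{
    struct tlv_hdr h;

    if (len < sizeof(h))
        return false;
    memcpy(&h, buf, sizeof(h));   /* avoids alignment assumptions about buf */
    if (h.sn_length < sizeof(h) || h.sn_length > len)
        return false;
    *type_out = h.sn_type;
    return true;
}
```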

Association change
#
SCTP_ASSOC_CHANGE - indicates that a change
has occurred in regard to an association (e.g. a new
association is now present on the socket or an
association has gone away/failed).
struct sctp_assoc_change {
    u_int16_t sac_type;
    u_int16_t sac_flags;
    u_int32_t sac_length;
    u_int16_t sac_state;
    u_int16_t sac_error;
    u_int16_t sac_outbound_streams;
    u_int16_t sac_inbound_streams;
    sctp_assoc_t sac_assoc_id;
};

A Peer Address Change event
#
An SCTP_PEER_ADDR_CHANGE will indicate that
something has occurred with the address (in-service,
out-of-service, added, deleted, etc.).
/* Address events */
struct sctp_paddr_change {
    u_int16_t spc_type;
    u_int16_t spc_flags;
    u_int32_t spc_length;
    struct sockaddr_storage spc_aaddr;
    u_int32_t spc_state;
    u_int32_t spc_error;
    sctp_assoc_t spc_assoc_id;
};

A Remote Error Event
#
An SCTP_REMOTE_ERROR will communicate a
remote error sent by the peer. This will be in the form
of a TLV and may indicate some internal stack
debugging information as to why an association was
closed.
/* remote error events */
struct sctp_remote_error {
    u_int16_t sre_type;
    u_int16_t sre_flags;
    u_int32_t sre_length;
    u_int16_t sre_error;
    sctp_assoc_t sre_assoc_id;
    u_int8_t sre_data[0];
};

Send Failure
#
An SCTP_SEND_FAILED will indicate that data
queued was not acknowledged by the peer and will
include the actual data that was attempted to be sent
(within some limits). This may occur due to partial
reliability or right before an association comes down.
/* data send failure event */
struct sctp_send_failed {
    u_int16_t ssf_type;
    u_int16_t ssf_flags;
    u_int32_t ssf_length;
    u_int32_t ssf_error;
    struct sctp_sndrcvinfo ssf_info;
    sctp_assoc_t ssf_assoc_id;
    u_int8_t ssf_data[0];
};

Shutdown Event
#
An SCTP_SHUTDOWN_EVENT indicates that a
graceful shutdown has occurred on an association.
/* shutdown event */
struct sctp_shutdown_event {
    u_int16_t sse_type;
    u_int16_t sse_flags;
    u_int32_t sse_length;
    sctp_assoc_t sse_assoc_id;
};

Adaption Layer Event
#
An SCTP_ADAPTION_INDICATION is a part of the
add-ip extension and allows an upper layer to
communicate an integer at startup informing the peer
what type of ULP is being operated (iSCSI, RDMA, ...).
/* Adaption layer indication stuff */
struct sctp_adaption_event {
    u_int16_t sai_type;
    u_int16_t sai_flags;
    u_int32_t sai_length;
    u_int32_t sai_adaption_ind;
    sctp_assoc_t sai_assoc_id;
};

Partial Delivery Event
#
An SCTP_PARTIAL_DELIVERY_EVENT will indicate
when something has gone wrong on a partial delivery
that has been begun (e.g. the association closed or
the message was skipped via partial reliability).
/* pdapi indications */
struct sctp_pdapi_event {
    u_int16_t pdapi_type;
    u_int16_t pdapi_flags;
    u_int32_t pdapi_length;
    u_int32_t pdapi_indication;
    sctp_assoc_t pdapi_assoc_id;
};

Common to events is the assoc_id
#
Note that all events include something called an
assoc_id.
#
This is a unique identifier for the association.
#
Many of the extended SCTP calls can use this for
sending and/or configuring an association with
socket options.
#
An application that wishes to use assoc_ids needs to
be aware of association id re-use and must pay close
attention to failure and closing events.

So how does one get notifications?
#
The socket option SCTP_EVENTS is used to turn
on/off all of the various events by passing it the
following structure:
/* On/Off setup for subscription to events */
struct sctp_event_subscribe {
    u_int8_t sctp_data_io_event;
    u_int8_t sctp_association_event;
    u_int8_t sctp_address_event;
    u_int8_t sctp_send_failure_event;
    u_int8_t sctp_peer_error_event;
    u_int8_t sctp_shutdown_event;
    u_int8_t sctp_partial_delivery_event;
    u_int8_t sctp_adaption_layer_event;
};

Subscribing Part II
#
Placing a '1' in the respective event type field turns an
event on.
#
Placing a '0' in the respective event type field turns an
event off.
#
Note that these events are the standard ones so far;
other events may be added as various extensions
work their way through the IETF.
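
A short sketch of filling in a subscription. To keep it compilable without an SCTP header, it uses a local mirror of the structure with shortened field names, and the helper name is made up; the setsockopt() step is described in a comment only.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the subscription structure, so the sketch builds
 * without an SCTP header. */
struct event_subscribe {
    uint8_t data_io_event;
    uint8_t association_event;
    uint8_t address_event;
    uint8_t send_failure_event;
    uint8_t peer_error_event;
    uint8_t shutdown_event;
    uint8_t partial_delivery_event;
    uint8_t adaption_layer_event;
};

/* Turn on only association and shutdown notifications; the rest stay off. */
static void subscribe_minimal(struct event_subscribe *ev)
{
    memset(ev, 0, sizeof(*ev));   /* '0' in every field: all events off */
    ev->association_event = 1;    /* '1' turns an event on */
    ev->shutdown_event = 1;
    /* A real program would now pass its platform's structure to
     * setsockopt() with the SCTP_EVENTS option. */
}
```

Zeroing the whole structure first matters: any field left uninitialized could silently turn an event on.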

Socket Options
#
SCTP provides a host of socket options to perform a
myriad of operations.
#
Some have unique structures; others just turn things
on and off with booleans or integers.
#
SCTP_NODELAY - Turns on/off the Nagle algorithm
(or other delay), similar to TCP.
#
SCTP_MAXSEG - Sets/gets a value for the SCTP
fragmentation point (an integer is passed). Note that
it's possible that the value the system uses is smaller
than what you set.
2%) 2%) 2%) 2003 Cisco Systems, Inc. All rights reserved. Presenttion!I"
More Socket Options
#
SCTP_ASSOCINFO - Retrieve or Set various
information about an association. Note that not all
fields in the structure are writeable.
#
SCTP_AUTOCLOSE - Sets an idle time after which an
association will automatically close. For one-to-many
style servers this can be used so that no connection
state needs to be maintained by the application.
#
SCTP_ADAPTION_LAYER - Set or Get the 32 bit
adaption layer indication that will be sent with INITs
or INIT-ACKs.
More Socket Options
#
SCTP_DEFAULT_SEND_PARAM - set or get the
default sending parameters (stream number, ppid,
context and other fields in the sctp_sndrcvinfo
structure).
#
SCTP_DISABLE_FRAGMENTS - boolean that will
disable SCTP fragmentation. Note that if
fragmentation is disabled, sends larger than the
fragment point will be rejected with an error return
code.
#
SCTP_EVENTS - we saw this one earlier, used to set
what notification events we wish to see.
More on Socket Options
#
SCTP_GET_PEER_ADDR_INFO - get information on
a peer's address. The information returned includes
the cwnd, srtt, rto and path mtu.
#
SCTP_I_WANT_MAPPED_V4_ADDR - this boolean is
normally on by default and makes it so an IPv6 socket
will map V4 addresses to V6. If this is turned off then V4
addresses will be passed up on a V6 socket as V4
addresses.
#
SCTP_INITMSG - Can be used to get or set the
default INIT/INIT-ACK settings such as number of
streams allowed in or requested out.
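As an illustration of a structure-based option, here is a hedged sketch of setting the stream counts with SCTP_INITMSG before any association is set up. It assumes a Linux system whose kernel headers provide <linux/sctp.h> (other platforms usually ship <netinet/sctp.h> instead), and the function name open_sctp_with_streams is hypothetical. On a host without SCTP support the socket() call simply fails and the function returns -1.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sctp.h>   /* assumption: Linux kernel headers are installed */

/* Hypothetical helper: returns a one-to-many SCTP socket that asks for
 * `streams` streams in each direction, or -1 on any failure. */
int open_sctp_with_streams(unsigned short streams)
{
    struct sctp_initmsg init;
    int sd = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);

    if (sd < 0)
        return -1;                      /* e.g. SCTP not available here */

    memset(&init, 0, sizeof(init));     /* other fields left at zero */
    init.sinit_num_ostreams = streams;  /* streams requested out */
    init.sinit_max_instreams = streams; /* streams allowed in */

    if (setsockopt(sd, IPPROTO_SCTP, SCTP_INITMSG,
                   &init, sizeof(init)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

The boolean and integer options follow the same setsockopt() pattern with an int in place of the structure.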
Even More on socket options
#
SCTP_PEER_ADDR_PARAMS - allows an endpoint to
get or set the heart beat interval and/or path
maximum retransmits on a specific peer address.
#
SCTP_PRIMARY_ADDR - Allows an application to
specify a peer's address as the "primary" address.
#
SCTP_RTOINFO - get or set the RTO information:
RTO.min, RTO.max and RTO.initial.
#
SCTP_SET_PEER_PRIMARY_ADDR - Allows an
endpoint to request that the peer change its primary
address to the one specified (note this will only
succeed if the peer supports the ADD-IP extension).
Final socket option page
#
SCTP_STATUS - allows an application to retrieve a
number of various parameters and stats with respect
to a specific association.
#
As you can see there are a LOT of options. If you will,
there is a knob for most things someone would
want to do to a transport connection.
#
The purpose of all of these knobs is to give the
application better control of the transport.
#
If you plan on using any of these options I would
highly recommend getting the UNP 3rd edition. This
gives all the details you will need to use these
effectively (with examples).
Extended "system calls"
#
sctp_connectx - Allows a user to specify multiple
addresses to attempt to connect to.
#
sctp_bindx - Allows an application to bind a set of
addresses instead of one or all addresses.
#
sctp_opt_info - Some implementations do not
support a getsockopt() call that allows data to be
passed both ways (some of the calls need an
association id to get information). Use this call to be
compatible with all implementations.
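Both sctp_connectx and sctp_bindx take a set of addresses laid out back to back in memory. The sketch below only builds such a packed array for the IPv4-only case using standard socket headers; it does not call either function, because their exact signatures vary between draft versions and platforms. The helper name pack_ipv4_addrs and the addresses in the usage note are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fills out[0..n-1] with the given dotted-quad
 * addresses and one shared port. Returns the number of addresses
 * packed, or -1 if any string is not a valid IPv4 address. */
int pack_ipv4_addrs(const char **ips, int n, uint16_t port,
                    struct sockaddr_in *out)
{
    for (int i = 0; i < n; i++) {
        memset(&out[i], 0, sizeof(out[i]));
        out[i].sin_family = AF_INET;
        out[i].sin_port = htons(port);   /* network byte order */
        if (inet_pton(AF_INET, ips[i], &out[i].sin_addr) != 1)
            return -1;
    }
    return n;
}
```

For example, packing "192.0.2.1" and "198.51.100.7" with port 5000 yields a two-element array; that array, cast to struct sockaddr *, together with the count, is the kind of argument these calls expect. Check your platform's manual pages for the exact parameter lists.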
Extended "system calls"
#
sctp_getpaddrs - This call will return a block of
memory holding the peer's addresses currently part of
the association.
#
sctp_freepaddrs - This call is used to release the
memory that the sctp_getpaddrs call allocated.
#
sctp_getladdrs - This call will return a block of
memory holding the local addresses bound to an
association.
#
sctp_freeladdrs - This call should be used to release
the memory allocated by sctp_getladdrs back to the
system.
Extended "system calls"
#
sctp_sendmsg - This call will allow the caller to
specify in its argument list things like the stream
number and other SCTPish information to be sent
with a message.
#
sctp_send - This call has a similar purpose to
sctp_sendmsg but instead of a large number of
arguments, a sctp_sndrcvinfo structure
is used to pass the relevant information.
#
sctp_recvmsg - This call (as we saw previously) is
used to receive a message but also a
sctp_sndrcvinfo structure with details on the
message (e.g. the stream number and stream
sequence number).
Extended "system calls"
#
sctp_peeloff - this call is used to convert a single
association that is part of a one-to-many socket into
an individual new socket descriptor that is a one-to-
one socket.
#
[Phil::: Should we go through each of these and put
signatures??]
#
[How do we end? A big example]
