Supplement to InfiniBand™ Architecture Specification Volume 1.2.1

Copyright © 2003-2009, by InfiniBand℠ Trade Association.
All rights reserved.
This document contains information proprietary to the InfiniBand℠ Trade
Association. Use or disclosure without written permission by an officer
of the InfiniBand℠ Trade Association is prohibited.

March 2, 2009
Revision 1.0

Annex A14: Extended Reliable Connected (XRC) Transport Service
InfiniBand™ Architecture Release 1.2.1, March 2, 2009
VOLUME 1 - GENERAL SPECIFICATIONS DRAFT
InfiniBand℠ Trade Association, Proprietary and Confidential
LEGAL DISCLAIMER
“This specification provided “AS IS” and without any warranty of any
kind, including, without limitation, any express or implied warranty of
non-infringement, merchantability or fitness for a particular purpose.
In no event shall IBTA or any member of IBTA be liable for any direct,
indirect, special, exemplary, punitive, or consequential damages,
including, without limitation, lost profits, even if advised of the
possibility of such damages.”
Table 0 Revision History
Revision Date
1.0 3/1/2009 Draft for publication
TABLE OF CONTENTS
Annex A14: Extended Reliable Connected (XRC) Transport Service
A14.1 Introduction
A14.2 Glossary
A14.3 XRC Transport
    A14.3.1 XRC OpCodes
    A14.3.2 XRC Packet Format
    A14.3.3 Error Detection and Handling
    A14.3.4 Header and Data Field Source
A14.4 XRC Software Transport Interface
A14.5 XRC Software Transport Verbs
    A14.5.1 Verbs Overview
    A14.5.2 Transport Resource Management
    A14.5.3 Work Request Processing
    A14.5.4 Result Types
A14.6 Communication Management
    A14.6.1 Extended Reliable Connected Service
    A14.6.2 Communication Management Messages
    A14.6.3 Message Field Details
A14.7 General Services
    A14.7.1 ClassPortInfo
LIST OF FIGURES
Figure 1 Extended Reliable Connected (XRC) Model
Figure 2 XRC Extended Transport Header (XRCETH)
Figure 3 XRC SEND Operation Example
Figure 4 XRC RDMA WRITE Operation Example
Figure 5 XRC RDMA READ Operation Example
Figure 6 XRC ATOMIC Operation Example
LIST OF TABLES
Table 0 Revision History
Table 1 OpCode field
Table 2 Requester Side Error Behavior
Table 3 Responder Error Behavior Summary
Table 4 Summary of XRC TGT QP Additional Responder Fault Class Behaviors
Table 5 Packet Fields and Parameters by Service
Table 6 Connection Parameters by Transport Service
Table 7 Packet Fields Validation source by Service
Table 8 Verb Classes
Table 9 XRC Target QP State Transition Properties
InfiniBand™ Architecture Release 1.2.1, Extended Reliable Connected (XRC) Transport Service, February 23, 2008
VOLUME 1 - GENERAL SPECIFICATIONS REV 1.0
ANNEX A14: EXTENDED RELIABLE CONNECTED (XRC) TRANSPORT SERVICE
A14.1 INTRODUCTION
This annex describes the Extended Reliable Connected (XRC) Transport
Service for InfiniBand. XRC significantly reduces the number of QPs
required to establish all-to-all process connectivity in large clusters.
The established trend toward multicore processors directly increases the
number of processes that typically run on each endnode of an IB-connected
cluster. Multicore nodes are very common today, and roadmaps show even
more cores per node in the near future.
Figure 1 Extended Reliable Connected (XRC) Model
[Figure: three hosts (Host 0 with 2 processes, Host 1 with 1 process,
Host 2 with 3 processes). Each process is shown with its own PD, CQ,
XRC SRQ and XRC INI QPs. Each host is also shown with XRC TGT QPs,
labeled by the remote host they serve, in a PD marked "common".]

With the Reliable Connected (RC) Transport Service, the number of QPs
required per endnode to achieve full process-to-process connectivity is
equal to N*p*p (where N is the number of nodes in the cluster and p is
the number of processes per node; see footnote 1). As the number of
processes grows together with the number of cores per system, the number
of RC QPs (and their associated memory resources) starts to have a
significant impact.

XRC is a new approach in the spirit of the Reliable Datagram (RD) model
that reduces the number of QPs required for full connectivity in the
scenario above by a factor of p, thus significantly improving the
scalability of the solution for large clusters of multicore endnodes.

XRC differs from RD in several ways, but first and foremost it eliminates
the most significant limitation of the RD Transport Service: the single
outstanding message supported per EE context. With XRC there is no limit
to the number of outstanding messages on the wire for an XRC QP. To
achieve this, XRC QPs operate similarly to regular RC QPs on the
requester side. There is no dynamic association between an SQ and an EE
context (as in RD) for XRC. The implication is that XRC QPs on the
initiator side (denoted XRC INI QPs) are typically per process and are
statically bound (through the connection context) to a single destination
node.

The savings in the overall number of required QPs come from the way XRC
operates on the responder side. The responder connection context (denoted
XRC TGT QP) allows the requester process to send messages targeting
multiple destination QPs (denoted XRC SRQs), which belong to the multiple
processes on the destination node. So with a single XRC INI QP, a process
on one node can communicate with ALL processes on one remote node, thus
reducing by a factor of p the overall number of QPs required for full
connectivity (compared to using RC QPs).

XRC SRQs are the destination node's per-process receive queues, which can
be targeted from multiple remote endnodes through the XRC TGT QPs. They
are in a way equivalent to the receive queues in RD QPs; as such, only
one is required per process to allow it to receive messages from any
process on any node in the cluster. This mode of operation requires the
requester side (XRC INI QP) to specify which XRC SRQ is being targeted
for every posted request message. This information is carried on the wire
by means of a new extended transport header, as described in Section
A14.3.2.1.

In the same way that RD QPs are limited to use with RD EE contexts in
their own Reliable Datagram Domain (RDD), the XRC Transport Service
implements an equivalent XRC Domain mechanism that serves the same
purpose. XRC TGT QPs can only be used as a conduit to access XRC SRQs
that were set up in the same XRC Domain.
1. In some deployments, processes may not require QPs to communicate with
other processes on the same node. In such cases the calculations for number of
QPs required should use N-1 where N is used.
Due to the shared nature of XRC SRQs (Receive WQEs could be fetched by
multiple XRC TGT QPs), end-to-end credits can’t be carried in ACK packets,
and the invalid credits code is used instead (as with regular SRQs).

As described above, XRC operates similarly to RC on the requester side
and similarly to RD on the responder side. Due to this asymmetry, the
transport objects in the XRC Transport Service have a few unique
characteristics, namely:
• XRC INI QPs are like regular RC QPs but do not have a responder side.
• XRC TGT QPs are similar to RD EECs but do not have a requester side.
• XRC SRQs are like RD QPs but do not have a requester side.

It is worth noting that, due to this asymmetry, XRC communication through
a single XRC INI/TGT pair is one way.

One further aspect of XRC is that, because XRC SRQs on the responder side
have no SQs, local operations such as Window binding, Fast Register and
Local Invalidate are not covered and would need to be executed on an
auxiliary RC QP when needed. Also, Type 2 Windows are not supported with
XRC.

From the above description it is straightforward to calculate how many of
each of the XRC transport objects are required for full connectivity on a
cluster of N nodes with p processes per node:
• Each process needs N XRC INI QPs, one to communicate with ALL
  processes at each remote node. Overall, N*p XRC INI QPs are required
  per node.
• Each process needs one XRC SRQ, which allows it to receive messages
  from any other process in the cluster. Overall, p XRC SRQs are
  required per node.
• Assuming a homogeneous cluster where all N nodes have p processes,
  each node has N*p XRC TGT QPs, which are the counterparts of the XRC
  INI QPs at the remote nodes that are used to target the processes in
  the local node.

So the total number of queues per node is: N*p XRC INI QPs, p XRC SRQs
and N*p XRC TGT QPs.
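
The per-node counts above can be checked with a short calculation. The
following Python sketch is illustrative only: the function names are
hypothetical, and it assumes the homogeneous cluster described above (N
nodes, p processes per node) without the N-1 adjustment mentioned in the
footnote.

```python
def rc_qps_per_node(n_nodes: int, procs_per_node: int) -> int:
    """RC QPs per node for full process-to-process connectivity: N*p*p."""
    return n_nodes * procs_per_node * procs_per_node


def xrc_objects_per_node(n_nodes: int, procs_per_node: int) -> dict:
    """XRC transport objects per node, following the counts in the text."""
    return {
        "xrc_ini_qps": n_nodes * procs_per_node,  # N per process, p processes
        "xrc_srqs": procs_per_node,               # one per process
        "xrc_tgt_qps": n_nodes * procs_per_node,  # counterparts of remote INI QPs
    }


if __name__ == "__main__":
    n, p = 1024, 16  # example cluster size, chosen only for illustration
    print("RC QPs per node:", rc_qps_per_node(n, p))
    print("XRC objects per node:", xrc_objects_per_node(n, p))
```

For this example size, the number of requester-side queues drops from
N*p*p to N*p, which is the factor of p described in the text.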
A14.2 GLOSSARY
XRC Acronym for the eXtended Reliable Connected transport service defined
in this annex.
XRC INI QP XRC Initiator QP. This is the initiator Queue for XRC
operations. XRC INI QPs are used to issue XRC outgoing requests and do
not have a responder side. XRC incoming requests will be handled by XRC
TGT QPs.
XRC TGT QP XRC Target QP. This is the responder for XRC operations. XRC
TGT QPs (together with XRC SRQs) are used to process incoming XRC
requests. XRC TGT QPs do not have a requester side. XRC outgoing requests
are issued through XRC INI QPs.
XRC SRQ This is the Receive Queue where Receive WQEs are posted for
incoming XRC requests. XRC request packets carry in an extended header
(XRCETH) the XRC SRQ number that is being targeted and from which a
receive WQE will be fetched if required.
XRC Domain Attribute used to associate XRC TGT QPs and XRC SRQs. XRC
packets can only target XRC SRQs in the same XRC Domain as the XRC TGT QP
that they are destined for.
XRCETH XRC Extended Transport Header. Present in XRC request packets.
A14.3 XRC TRANSPORT
A14.3.1 XRC OPCODES
Table 1 OpCode field

Code[7-5] value 101 denotes Extended Reliable Connection (XRC).

Code[7-5]  Code[4-0]    Description                     Packet Contents following the Base Transport Header (a)
101        00000        SEND First                      XRCETH, PayLd
101        00001        SEND Middle                     XRCETH, PayLd
101        00010        SEND Last                       XRCETH, PayLd
101        00011        SEND Last with Immediate        XRCETH, ImmDt, PayLd
101        00100        SEND Only                       XRCETH, PayLd
101        00101        SEND Only with Immediate        XRCETH, ImmDt, PayLd
101        00110        RDMA WRITE First                XRCETH, RETH, PayLd
101        00111        RDMA WRITE Middle               XRCETH, PayLd
101        01000        RDMA WRITE Last                 XRCETH, PayLd
101        01001        RDMA WRITE Last with Immediate  XRCETH, ImmDt, PayLd
101        01010        RDMA WRITE Only                 XRCETH, RETH, PayLd
101        01011        RDMA WRITE Only with Immediate  XRCETH, RETH, ImmDt, PayLd
101        01100        RDMA READ Request               XRCETH, RETH
101        01101        RDMA READ response First        AETH, PayLd
101        01110        RDMA READ response Middle       PayLd
101        01111        RDMA READ response Last         AETH, PayLd
101        10000        RDMA READ response Only         AETH, PayLd
101        10001        Acknowledge                     AETH
101        10010        ATOMIC Acknowledge              AETH, AtomicAckETH
101        10011        CmpSwap                         XRCETH, AtomicETH
101        10100        FetchAdd                        XRCETH, AtomicETH
101        10101        Reserved                        undefined
101        10110        SEND Last with Invalidate       XRCETH, IETH, PayLd
101        10111        SEND Only with Invalidate       XRCETH, IETH, PayLd
101        11000-11111  Reserved                        undefined
110 - 111  00000-11111  Manufacturer Specific OpCodes   undefined

a. All OpCodes have the ICRC and VCRC attached.
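
As an illustration of how the two OpCode sub-fields in Table 1 combine,
the following Python sketch builds and decodes the 8-bit OpCode value.
The names (XRC_PREFIX, XRC_OPCODES, encode_xrc_opcode, decode_opcode) are
hypothetical and not part of any defined interface; the table contents
are copied from Table 1.

```python
XRC_PREFIX = 0b101  # Code[7-5] value for XRC in Table 1

# Code[4-0] -> (description, contents following the BTH), per Table 1.
XRC_OPCODES = {
    0b00000: ("SEND First", ["XRCETH", "PayLd"]),
    0b00001: ("SEND Middle", ["XRCETH", "PayLd"]),
    0b00010: ("SEND Last", ["XRCETH", "PayLd"]),
    0b00011: ("SEND Last with Immediate", ["XRCETH", "ImmDt", "PayLd"]),
    0b00100: ("SEND Only", ["XRCETH", "PayLd"]),
    0b00101: ("SEND Only with Immediate", ["XRCETH", "ImmDt", "PayLd"]),
    0b00110: ("RDMA WRITE First", ["XRCETH", "RETH", "PayLd"]),
    0b00111: ("RDMA WRITE Middle", ["XRCETH", "PayLd"]),
    0b01000: ("RDMA WRITE Last", ["XRCETH", "PayLd"]),
    0b01001: ("RDMA WRITE Last with Immediate", ["XRCETH", "ImmDt", "PayLd"]),
    0b01010: ("RDMA WRITE Only", ["XRCETH", "RETH", "PayLd"]),
    0b01011: ("RDMA WRITE Only with Immediate",
              ["XRCETH", "RETH", "ImmDt", "PayLd"]),
    0b01100: ("RDMA READ Request", ["XRCETH", "RETH"]),
    0b01101: ("RDMA READ response First", ["AETH", "PayLd"]),
    0b01110: ("RDMA READ response Middle", ["PayLd"]),
    0b01111: ("RDMA READ response Last", ["AETH", "PayLd"]),
    0b10000: ("RDMA READ response Only", ["AETH", "PayLd"]),
    0b10001: ("Acknowledge", ["AETH"]),
    0b10010: ("ATOMIC Acknowledge", ["AETH", "AtomicAckETH"]),
    0b10011: ("CmpSwap", ["XRCETH", "AtomicETH"]),
    0b10100: ("FetchAdd", ["XRCETH", "AtomicETH"]),
    0b10110: ("SEND Last with Invalidate", ["XRCETH", "IETH", "PayLd"]),
    0b10111: ("SEND Only with Invalidate", ["XRCETH", "IETH", "PayLd"]),
}


def encode_xrc_opcode(code: int) -> int:
    """Combine the XRC prefix (bits 7-5) with a 5-bit operation code."""
    if not 0 <= code <= 0b11111:
        raise ValueError("Code[4-0] must fit in 5 bits")
    return (XRC_PREFIX << 5) | code


def decode_opcode(opcode: int):
    """Return (description, contents) for an XRC OpCode, else None."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("OpCode must fit in 8 bits")
    prefix, code = opcode >> 5, opcode & 0x1F
    if prefix != XRC_PREFIX:
        return None  # not an XRC OpCode
    return XRC_OPCODES.get(code, ("Reserved", []))
```

Note that, in this table, every request OpCode lists XRCETH first, while
the response OpCodes (RDMA READ responses and acknowledges) do not carry
it.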
A14.3.2 XRC PACKET FORMAT
XRC Request Packets carry an additional extended transport header,
XRCETH, as described below.
A14.3.2.1 XRC EXTENDED TRANSPORT HEADER (XRCETH)
The XRC Extended Transport Header (XRCETH) contains the Destination XRC
SRQ identifier.
CA14-1: If a CA implements the XRC transport service, then when
generating a request packet, the sender shall set the XRCETH fields as
described in Section A14.3.2.1 XRC Extended Transport Header (XRCETH).
A14.3.2.1.1 RESERVED - 8 BITS
A14.3.2.1.2 XRCSRQ - 24 BITS
This field indicates the XRC Shared Receive Queue number to be used
by the responder for this packet.
A14.3.2.2 XRC PACKET FORMAT RULES
CA14-2: All packets of a single XRC message SHALL carry the exact same
XRCETH.
CA14-3: The responder SHALL verify that the XRC Domain for the XRCSRQ
identified by the first/only inbound packet is the same as the XRC Domain
of the XRC TGT QP.
The responder MAY verify that the XRCSRQ identified by a middle/last
packet is the same as for all previous packets in the message.
CA14-4: The responder SHALL use the PD from the XRCSRQ identified by the
XRC packet (instead of the PD of the XRC TGT QP).
CA14-5: If a receive WQE is required for the message, the responder SHALL
fetch it from the RQ of the XRCSRQ pointed to by the message.
CA14-6: If a completion is to be generated for the message, the responder
SHALL use the CQ from the XRCSRQ identified by the XRC packet.
CA14-7: The E2E credits (MSN field) in XRC ACK packets SHALL be set
to ‘invalid’.
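
The responder-side rules CA14-3 through CA14-6 can be summarized in a
small model. The following Python sketch is a simplified illustration,
not an implementation of a channel adapter: all class and function names
are hypothetical, resources are represented by plain strings, and the
exceptions only loosely mirror the error names used later in this annex.

```python
from dataclasses import dataclass, field


@dataclass
class XrcSrq:
    xrc_domain: int
    pd: str
    cq: str
    receive_wqes: list = field(default_factory=list)


@dataclass
class XrcTgtQp:
    xrc_domain: int
    pd: str  # not used for XRC requests; see CA14-4


class InvalidXrceth(Exception):
    """The XRCSRQ named in the XRCETH does not exist."""


class XrcDomainViolation(Exception):
    """The XRCSRQ and the XRC TGT QP are in different XRC Domains."""


class ResourcesNotReady(Exception):
    """A receive WQE is required but none is posted."""


def resolve_target(tgt_qp, srq_table, xrcsrq, needs_receive_wqe):
    """Pick the PD, CQ and (optionally) receive WQE for an inbound request."""
    srq = srq_table.get(xrcsrq)
    if srq is None:
        raise InvalidXrceth(f"unknown XRCSRQ {xrcsrq}")
    if srq.xrc_domain != tgt_qp.xrc_domain:       # CA14-3
        raise XrcDomainViolation(f"XRCSRQ {xrcsrq} is in another domain")
    wqe = None
    if needs_receive_wqe:                         # CA14-5
        if not srq.receive_wqes:
            raise ResourcesNotReady(f"no receive WQE on XRCSRQ {xrcsrq}")
        wqe = srq.receive_wqes.pop(0)
    return srq.pd, srq.cq, wqe                    # CA14-4 and CA14-6
```

The point of the sketch is that the PD, CQ and receive WQE all come from
the XRC SRQ named in the packet, while the XRC TGT QP contributes only the
XRC Domain check.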
bytes   bits 31-24   bits 23-16   bits 15-8   bits 7-0
0-3     Reserved     XRCSRQ (bits 23-0)
Figure 2 XRC Extended Transport Header (XRCETH)
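
The layout in Figure 2 can be illustrated with a short packing routine.
The following Python sketch is hypothetical (the function names are not
part of any defined interface) and assumes the header is transmitted in
network byte order with the reserved byte set to zero; it is a sketch of
the field layout, not of a complete packet builder.

```python
import struct


def pack_xrceth(xrcsrq: int) -> bytes:
    """Build the 4-byte XRCETH: 8 reserved bits, then the 24-bit XRCSRQ."""
    if not 0 <= xrcsrq <= 0xFFFFFF:
        raise ValueError("XRCSRQ must fit in 24 bits")
    # Reserved bits 31-24 are assumed to be zero here.
    return struct.pack(">I", xrcsrq)


def unpack_xrceth(data: bytes) -> int:
    """Extract the XRCSRQ number from the first 4 bytes of an XRCETH."""
    (word,) = struct.unpack(">I", data[:4])
    return word & 0xFFFFFF  # ignore the reserved byte
```

For example, pack_xrceth(0x012345) yields the bytes 00 01 23 45.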
A14.3.2.3 XRC PACKET EXAMPLES
Figure 3 XRC SEND Operation Example

A 700 byte SEND Operation uses 3 packets, assuming a 256 Byte PMTU.
Acknowledgment Packets, used for reliable transport services, are not
shown.

Packet  BTH OpCode (a)                               Packet layout
#1      “SEND First”                                 LRH, [GRH], BTH, XRCETH, Data Payload, ICRC, VCRC
#2      “SEND Middle”                                LRH, [GRH], BTH, XRCETH, Data Payload, ICRC, VCRC
#3      “SEND Last” or “SEND Last with Immediate”    LRH, [GRH], BTH, XRCETH, [ImmDt], Data Payload, ICRC, VCRC

Fields in [brackets] are packet header fields present if necessary.

a. The BTH OpCode field determines that the XRCETH is present (and
whether or not the ImmDt is present).

Field    Name
LRH      Local Route Header
GRH      Global Route Header
BTH      Base Transport Header
XRCETH   eXtended Reliable Connected Extended Transport Header
ImmDt    Immediate Extended Transport Header
ICRC     Invariant CRC
VCRC     Variant CRC
Acknowledge Packets (and responses in general) are identical to those of
RC Transport Service (i.e. there is no XRCETH in the response packets).
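
The segmentation used in the 700 byte examples can be sketched as
follows. This Python fragment is illustrative only: the function name is
hypothetical, it returns operation names rather than encoded OpCodes, and
it does not model the "with Immediate" or "with Invalidate" variants.

```python
def segment_message(length: int, pmtu: int, operation: str = "SEND"):
    """Split a message into (packet type, payload bytes) pairs.

    A message that fits in one PMTU uses an "Only" packet; otherwise it
    uses one "First", zero or more "Middle" and one "Last" packet, where
    "First" and "Middle" packets carry exactly PMTU bytes.
    """
    if length < 0 or pmtu <= 0:
        raise ValueError("length must be >= 0 and pmtu must be > 0")
    if length <= pmtu:
        return [(f"{operation} Only", length)]
    packets = []
    remaining = length
    while remaining > pmtu:
        kind = "First" if not packets else "Middle"
        packets.append((f"{operation} {kind}", pmtu))
        remaining -= pmtu
    packets.append((f"{operation} Last", remaining))
    return packets


if __name__ == "__main__":
    for name, size in segment_message(700, 256):
        print(name, size)
```

With a 256 byte PMTU, a 700 byte message is split into payloads of 256,
256 and 188 bytes.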
Figure 4 XRC RDMA WRITE Operation Example

A 700 byte RDMA WRITE Operation uses 3 packets, assuming a 256 Byte PMTU.
Acknowledgment Packets, used for reliable transport services, are not
shown.

Packet  BTH OpCode (a)                                           Packet layout
#1      “RDMA WRITE First”                                       LRH, [GRH], BTH, XRCETH, RETH, Data Payload, ICRC, VCRC
#2      “RDMA WRITE Middle”                                      LRH, [GRH], BTH, XRCETH, Data Payload, ICRC, VCRC
#3      “RDMA WRITE Last” or “RDMA WRITE Last with Immediate”    LRH, [GRH], BTH, XRCETH, [ImmDt], Data Payload, ICRC, VCRC

Fields in [brackets] are packet header fields present if necessary.

a. The BTH OpCode field determines that the XRCETH is present (and
whether or not the ImmDt is present).

Field    Name
LRH      Local Route Header
GRH      Global Route Header
BTH      Base Transport Header
XRCETH   eXtended Reliable Connected Extended Transport Header
RETH     RDMA Extended Transport Header
ImmDt    Immediate Extended Transport Header
ICRC     Invariant CRC
VCRC     Variant CRC
Figure 5 XRC RDMA READ Operation Example

A 700 byte RDMA READ Operation has 3 response packets, assuming a 256
Byte PMTU.

Packet   BTH OpCode                    Packet layout
Request  “RDMA READ Request”           LRH, [GRH], BTH, XRCETH, RETH, ICRC, VCRC
#1       “RDMA READ Response First”    LRH, [GRH], BTH, AETH, Data Payload, ICRC, VCRC
#2       “RDMA READ Response Middle”   LRH, [GRH], BTH, Data Payload, ICRC, VCRC
#3       “RDMA READ Response Last”     LRH, [GRH], BTH, AETH, Data Payload, ICRC, VCRC

Fields in [brackets] are packet header fields present if necessary.

[Ladder diagram: the single RDMA READ Request Packet initiated by the
requestor node, followed by RDMA READ Response Packets #1, #2 and #3. In
this example, the destination node segments the data into three response
packets.]

Field    Name
LRH      Local Route Header
GRH      Global Route Header
BTH      Base Transport Header
XRCETH   eXtended Reliable Connected Extended Transport Header
AETH     Acknowledgment Extended Transport Header
RETH     RDMA Extended Transport Header
ICRC     Invariant CRC
VCRC     Variant CRC
A14.3.3 ERROR DETECTION AND HANDLING
CA14-8: If a CA implements the XRC transport service, then Error
Detection and Handling should be as described in Section A14.3.3.1
Summary - Requester Side Error Behavior and Section A14.3.3.2 Summary -
Responder Side Error Behavior.
Figure 6 XRC ATOMIC Operation Example

Packet          Packet layout
Request         LRH, [GRH], BTH, XRCETH, AtomicETH, ICRC, VCRC
Acknowledgment  LRH, [GRH], BTH, AETH, AtomicAckETH, ICRC, VCRC

Fields in [brackets] are packet header fields present if necessary.

[Ladder diagram: the “ATOMIC Command” Request Packet and the returning
“ATOMIC Acknowledge” response.]

Field         Name
LRH           Local Route Header
GRH           Global Route Header
BTH           Base Transport Header
XRCETH        eXtended Reliable Connected Extended Transport Header
AETH          Acknowledgment Extended Transport Header
AtomicETH     ATOMIC Request Extended Transport Header
AtomicAckETH  ATOMIC Acknowledgment Extended Transport Header
ICRC          Invariant CRC
VCRC          Variant CRC
A14.3.3.1 SUMMARY - REQUESTER SIDE ERROR BEHAVIOR

Table 2 Requester Side Error Behavior

Error: Packet sequence error. Retry limit not exceeded.
    Description: Responder detected a PSN larger than it expected.
        Requester may retry the request.
    Syndrome: NAK-Sequence Error
    Requestor Fault Behavior Class: XRC: Class A (a)

Error: Packet sequence error. Retry limit exceeded.
    Description: Responder detected a PSN larger than it expected. The
        requestor performed retries, and automatic path migration and
        additional retries, if applicable, but all attempts failed.
    Syndrome: NAK-Sequence Error
    Requestor Fault Behavior Class: XRC: Class B

Error: Implied NAK sequence error. Retry limit not exceeded.
    Description: Requestor detected an ACK with a PSN larger than the
        expected PSN for an RDMA READ or ATOMIC response. Requester may
        retry the request.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class A

Error: Implied NAK sequence error. Retry limit exceeded.
    Description: Requestor detected an ACK with a PSN larger than the
        expected PSN for an RDMA READ or ATOMIC response. The requestor
        performed retries, and automatic path migration and additional
        retries, if applicable, but all attempts failed.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: Local Ack Timeout error. Retry limit not exceeded.
    Description: No ACK response from responder within timer interval.
        Requester may retry the request.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class A

Error: Local Ack Timeout error. Retry limit exceeded.
    Description: No ACK response within timer interval. The requestor
        performed retries, and automatic path migration and additional
        retries, but all attempts failed.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: RNR NAK Retry error. Retry limit not exceeded.
    Description: Responder returned RNR NAK. Requestor may retry the
        request.
    Syndrome: RNR-NAK
    Requestor Fault Behavior Class: XRC: Class A

Error: RNR NAK Retry error. Retry limit exceeded.
    Description: Excessive RNR NAKs returned by the responder. Requestor
        retried the request “n” times, but received RNR NAK each time.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: Unsupported OpCode.
    Description: Responder detected an unsupported OpCode.
    Syndrome: NAK-Invalid Request
    Requestor Fault Behavior Class: XRC: Class B

Error: Unexpected OpCode.
    Description: Responder detected an error in the sequence of OpCodes,
        such as a missing “Last” packet. Note: there is no PSN error,
        thus this does not indicate a dropped packet.
    Syndrome: NAK-Invalid Request
    Requestor Fault Behavior Class: XRC: Class B

Error: Local Memory Protection Error.
    Description: Requester detected an implementation specific memory
        protection error in its local memory subsystem.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: R_Key Violation
    Description: Responder detected an invalid R_Key while executing an
        RDMA Request.
    Syndrome: NAK-Remote Access Error
    Requestor Fault Behavior Class: XRC: Class B

Error: Remote Operation Error
    Description: Responder encountered an error (local to the responder)
        which prevented it from completing the request.
    Syndrome: NAK-Remote Operation Error
    Requestor Fault Behavior Class: XRC: Class B

Error: Local Operation Error (b) - WQE
    Description: An error occurred in the requester’s local channel
        interface that can be associated with a certain WQE.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: Local Operation Error (b) - affiliated or unaffiliated
    Description: An error occurred in the requester’s local channel
        interface that cannot be associated with a certain WQE.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class C

Error: Remote XRC Domain Violation
    Description: Responder’s Receive Queue detected an XRC Domain
        violation.
    Syndrome: NAK-Invalid RD Request (c)
    Requestor Fault Behavior Class: XRC: Class B

Error: Remote XRCETH Violation
    Description: Responder’s Receive Queue detected an XRCETH violation
        (e.g. XRCSRQ does not exist or is in the wrong state, or XRCETH
        in middle/last is different than XRCETH in first/only).
    Syndrome: NAK-Invalid RD Request (c)
    Requestor Fault Behavior Class: XRC: Class B

Error: Length error
    Description: RDMA READ response message contained too much or too
        little payload data.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: Bad response
    Description: Unexpected opcode for the response packet received at
        the expected response PSN. (d)
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class B

Error: Ghost Acknowledge
    Description: Requester received an acknowledge message at other than
        the expected response PSN.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class E (e)

Error: CQ overflow
    Description: Despite actual execution of the message, and
        acknowledgement, the completion notification could not be written
        to the CQ.
    Syndrome: locally detected error
    Requestor Fault Behavior Class: XRC: Class F

a. where Class A as defined in chapter 9 is interpreted to now support
XRC INI QPs also (i.e. wherever it reads RC it should be interpreted as
RC or XRC INI).
b. Local operations errors tend to be very implementation specific; not
all CAs may have or detect these.
c. we are deliberately “overloading” this RD specific NAK syndrome as it
matches the spirit of the error in question. There is no possible
confusion since this is an XRC QP.
d. For example: RDMA read instead of Acknowledge, NAK code in AETH of an
RDMA read, or “RDMA READ Response last” instead of middle. Note that
there are specific exceptions that an implementation may elect to not
treat as a bad response, which are: out of order RDMA READ Response
Opcodes in the case where the requester had generated a duplicate RDMA
READ Request, or the reception of an ACK instead of an RDMA READ Response
or Atomic Response. This ACK may be the result of an unsolicited ACK sent
by the responder that arrives at the requester before the expected RDMA
READ or Atomic Response. The requester may drop this ACK packet with no
ill effects.
e. where Class E as defined in chapter 9 is interpreted to now support XRC INI QPs also (i.e. wherever it reads RC
it should be interpreted as RC or XRC INI).
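
The retry-related rows of Table 2 follow a common pattern: the same error
maps to Class A while the retry limit has not been exceeded and to Class B
once it has. The following Python sketch illustrates only that pattern.
The names are hypothetical, the error labels are shortened forms of the
table entries, and the single retry counter is a simplification (an
implementation may track RNR NAK retries separately).

```python
# Errors from Table 2 whose class depends on the retry limit.
RETRYABLE_ERRORS = {
    "Packet sequence error",
    "Implied NAK sequence error",
    "Local Ack Timeout error",
    "RNR NAK Retry error",
}


def requester_fault_class(error: str, retries_done: int, retry_limit: int) -> str:
    """Return the Table 2 fault class for a retry-governed requester error."""
    if error not in RETRYABLE_ERRORS:
        raise ValueError(f"{error!r} is not governed by the retry limit")
    if retries_done < retry_limit:
        return "XRC: Class A"  # requester may retry the request
    return "XRC: Class B"      # all retry attempts failed
```

All other rows of Table 2 map to a fixed class regardless of the retry
state and are therefore not modeled here.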
A14.3.3.2 SUMMARY - RESPONDER SIDE ERROR BEHAVIOR
Table 3 Responder Error Behavior Summary

Error: Malformed WQE
    Description: Responder detected a malformed Receive Queue WQE while
        processing the packet.
    Service: XRC
    Syndrome: NAK-Remote Operational Error
    Fault Behavior Class: Responder Class A

Error: Unsupported or Reserved OpCode
    Description: Inbound request OpCode was either reserved, or was for a
        function not supported by this QP, e.g. RDMA or ATOMIC on a QP
        not set up for this. For RC this is “QP Async affiliated”.
    Service: XRC
    Syndrome: NAK-Invalid Request
    Fault Behavior Class: Responder Class K

Error: Misaligned ATOMIC
    Description: VA does not point to an aligned address on an atomic
        operation.
    Service: XRC
    Syndrome: NAK-Invalid Request
    Fault Behavior Class: Responder Class K

Error: Too many RDMA READ or ATOMIC Requests
    Description: There were more requests received and not ACKed than
        allowed for the connection.
    Service: XRC
    Syndrome: NAK-Invalid Request
    Fault Behavior Class: Responder Class K

Error: Out of Sequence Request Packet
    Description: PSN of the inbound request is outside the responder’s
        valid PSN window.
    Service: XRC
    Syndrome: NAK-Sequence error
    Fault Behavior Class: Responder Class B

Error: Out of Sequence OpCode, current packet is “first” or “Only”
    Description: The Responder detected an error in the sequence of
        OpCodes; a missing “Last” packet.
    Service: XRC
    Syndrome: NAK-Invalid Request
    Fault Behavior Class: Responder Class K

Error: Out of Sequence OpCode, current packet is not “first” or “Only”
    Description: The Responder detected an error in the sequence of
        OpCodes; a missing “First” packet.
    Service: XRC
    Syndrome: NAK-Invalid Request
    Fault Behavior Class: Responder Class K

Error: R_Key Violation
    Description: Responder detected an R_Key violation while executing an
        RDMA request.
    Service: XRC
    Syndrome: NAK-Remote Access Violation
    Fault Behavior Class: Responder Class K

Error: Local QP Error
    Description: Responder detected a local QP related error while
        executing the request message. The local error prevented the
        responder from completing the request. The local QP includes the
        shared receive queue, if one exists. A local QP error also occurs
        if a receive queue which is associated with a shared receive
        queue is unable to fetch a WQE from the shared receive queue due
        to an error condition in the shared receive queue.
    Service: XRC
    Syndrome: NAK-Remote Operational Error
    Fault Behavior Class: Responder Class K

Error: Packet Header Violation
    Description: Responder detected a header violation that requires a
        silent drop as described in 9.6 Packet Transport Header
        Validation on page 272.
    Service: XRC
    Syndrome: none
    Fault Behavior Class: Responder Class D
Table 3 Responder Error Behavior Summary (Continued)
Error | Description | Service | Syndrome | Fault Behavior Class
XRC Domain Violation | Responder’s Receive Queue detected an invalid XRC Domain | XRC | NAK-Invalid RD Request (a) | Responder Class K
Invalid XRCETH | XRC SRQ does not exist or is not in the right state or XRCETH in middle/last is different than XRCETH in first/only | XRC | NAK-Invalid RD Request (a) | Responder Class K
Resources Not Ready Error | A WQE or other resource is not currently available. | XRC | RNR-NAK | Responder Class B
Length errors | 1) Inbound “Send” request message exceeded the responder’s available buffer space: “Local Length Error” 2) RDMA WRITE request message contained too much or too little payload data compared to the DMA length advertised in the first or only packet. 3) Payload length was not consistent with the opcode: a: 0 byte <= “only” <= PMTU bytes; b: (“first” or “middle”) == PMTU bytes; c: 1 byte <= “last” <= PMTU bytes 4) Inbound message exceeded the size supported by the CA port | XRC | NAK-Invalid Request | Responder Class K
Invalid duplicate ATOMIC Request | A duplicate ATOMIC request packet is received, but the PSN does not match the PSN of a saved ATOMIC Request. | XRC | none | Responder Class D
CQ overflow | Despite actual execution of the message, and acknowledgement, the completion notification could not be written to the CQ. | XRC | none | Responder Class L
Local XRC TGT QP Error | Responder detected a local XRC TGT QP related error while executing the request message. The local error prevented the responder from completing the request. | XRC | none | Responder Class K
Remote Invalidate Error | Incoming Send with Invalidate contains an invalid R_Key, or the R_Key contained in the IETH cannot be invalidated. | XRC | NA | Responder Class M
a. We are deliberately “overloading” this RD specific NAK syndrome as it matches the spirit of the error in question. There is no possible confusion since this is an XRC QP.
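Item 3 of the "Length errors" row in Table 3 can be read as a simple predicate on the payload length. The following is a minimal sketch in Python, assuming the opcode has already been classified as "only", "first", "middle", or "last"; the function name and the string labels are illustrative and are not defined by this annex.

```python
def payload_length_is_consistent(position: str, payload_len: int, pmtu: int) -> bool:
    """Return True if the payload length is consistent with the packet's
    position in the message (item 3 of the "Length errors" row)."""
    if position == "only":
        # a: 0 bytes <= "only" <= PMTU bytes
        return 0 <= payload_len <= pmtu
    if position in ("first", "middle"):
        # b: "first" and "middle" packets carry exactly PMTU bytes
        return payload_len == pmtu
    if position == "last":
        # c: 1 byte <= "last" <= PMTU bytes
        return 1 <= payload_len <= pmtu
    raise ValueError(f"unknown packet position: {position!r}")
```

Per the same row, a packet that fails this check would be answered with NAK-Invalid Request and handled as a Responder Class K fault.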
A14.3.3.2.1 RESPONDER CLASS K FAULT BEHAVIOR
For an HCA XRC TGT QP, a Class K responder-side error shall be reported
to the requester by generating the appropriate NAK code as specified in
Table 3 Responder Error Behavior Summary on page 18. The error shall
also be reported to the responder’s client as an Affiliated Asynchronous
error (see Section 14.5.4.3 on page 41 for details), and the XRC TGT QP
shall be placed into the error state. In addition, if the error can be
related to a particular WQE on a given XRC SRQ, a Completion error is
generated in the XRC SRQ CQ. As a result of placing the XRC TGT QP into
the error state, other receive WQEs in the same or a different XRC SRQ
may be completed in error.
A14.3.3.2.2 RESPONDER CLASS L FAULT BEHAVIOR
A Class L error occurs when the CQ is inaccessible or full and an attempt
is made to complete a WQE on a XRC SRQ.
The XRC TGT QP through which the packet was received shall be moved to
the error state and affiliated asynchronous errors generated as described
in Section 14.5.4.3 on page 41. The current WQE and any subsequent WQEs
on the XRC SRQ in question are left in an unknown state and the XRC SRQ
is moved to the error state.
As a result of placing the XRC TGT QP into the error state, other receive
WQEs on different XRC SRQs may be completed in error.
A14.3.3.2.3 RESPONDER CLASS M FAULT BEHAVIOR
This class of locally detected error occurs only for XRC TGT QPs. A Class
M error is reported to the responder side client, but is not reported via a
NAK code to the requester.
Table 4 Summary of XRC TGT QP Additional Responder Fault Class Behaviors
Fault Behavior Class | NAK Codes Returned | Current Receive WQE (a) | Subsequent Receive WQEs | Final Receive Queue State
Responder Class K | Invalid Request; Invalid RD Request; Remote Access Violation; Remote Operational Error | One or more WQEs may be completed in error | NA (b) | error state
Responder Class L | none | One or more WQEs may be completed in error | NA (b) | error state
Responder Class M | None | Completed in error | NA (b) | error state
a. A WQE is only completed if open for Sends and RDMA WRITE with Immediate data.
b. Receive WQEs are never posted to XRC TGT QPs
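Table 4 can also be held as a small lookup structure, which is sometimes convenient when writing a responder model or a test harness. The sketch below is illustrative only; the names `FaultBehavior`, `TABLE_4`, and `may_return_nak` are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FaultBehavior:
    nak_codes: Tuple[str, ...]  # NAK codes the class may return to the requester
    current_wqe: str            # effect on the current receive WQE
    final_state: str            # final receive queue state


# Contents transcribed from Table 4.
TABLE_4 = {
    "K": FaultBehavior(
        ("Invalid Request", "Invalid RD Request",
         "Remote Access Violation", "Remote Operational Error"),
        "one or more WQEs may be completed in error",
        "error",
    ),
    "L": FaultBehavior((), "one or more WQEs may be completed in error", "error"),
    "M": FaultBehavior((), "completed in error", "error"),
}


def may_return_nak(fault_class: str) -> bool:
    """True if the fault class can report the error to the requester via a NAK."""
    return bool(TABLE_4[fault_class].nak_codes)
```

Note that this only captures whether a class may return a NAK; whether a specific error does so is given by the Syndrome column of Table 3.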
A responder class M error occurs when a responder receives a SEND with
Invalidate request (either SEND last with Invalidate or SEND only with
Invalidate) which contains an invalid R_Key in the IETH.
In terms of error precedence, all other errors associated with validating
the packet headers and executing the SEND operation onto which the
Invalidate is piggybacked are reported before an R_Key violation is
reported.
The following statements constitute the requirements for detecting and
reporting an R_Key violation associated with a SEND with Invalidate
request:
1) An R_Key violation, if one occurs, shall not be reported until such
time as the SEND request onto which the invalidate has been piggybacked
has been successfully executed. Any errors resulting from executing the
SEND request must be reported before an error resulting from the
invalidate operation is reported.
2) If an error occurs due to execution of the underlying SEND operation,
no error related to the invalidate operation shall be reported.
3) If the underlying SEND operation executes normally, a receive WQE
shall be consumed regardless of the success or failure of the associated
invalidate operation. In other words, as a result of executing the
underlying SEND request onto which the invalidate has been piggybacked,
a receive WQE will have been consumed. Thus, even if the invalidate
operation fails, the receive WQE is always consumed.
4) The receive WQE which receives the SEND with Invalidate request shall
not be completed until the corresponding invalidate operation has been
completed.
5) If an error is detected in the course of executing the invalidate
operation, the following actions shall occur:
a) The WQE that experienced the SEND with Invalidate error is marked as
completed in error.
b) The XRC TGT QP is transitioned to the Error State and affiliated
asynchronous errors generated as described in Section 14.5.4.3 on
page 41.
c) As a result of placing the XRC TGT QP into the error state, other
receive WQEs on different XRC SRQs may be completed in error.
d) No NAK is sent to the requester.
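The precedence rules above can be summarized as a small decision function. This is a minimal sketch, assuming the outcome of the SEND and of the invalidate are already known as booleans; the function name and the action strings are hypothetical, and the handling of a failed SEND (which depends on that error's own fault class) is deliberately not modeled.

```python
from typing import List


def send_with_invalidate_actions(send_ok: bool, invalidate_ok: bool) -> List[str]:
    """List responder-side actions for a SEND with Invalidate on an XRC TGT QP."""
    if not send_ok:
        # Rule 2: a SEND error suppresses any invalidate-related error.
        # What else happens is governed by the SEND error's own fault class.
        return ["report SEND error"]
    # Rule 3: the receive WQE is consumed once the SEND executes normally.
    actions = ["consume receive WQE"]
    if invalidate_ok:
        # Rule 4: the WQE completes only after the invalidate has completed.
        actions.append("complete WQE successfully")
        return actions
    # Rule 5: the invalidate failed (Class M). Nothing is sent to the requester.
    actions += [
        "complete WQE in error",
        "move XRC TGT QP to Error state",
        "generate affiliated asynchronous error",
    ]
    return actions
```

For example, `send_with_invalidate_actions(True, False)` yields the Class M sequence: the WQE is consumed and completed in error, and the XRC TGT QP moves to the Error state without any NAK-related action.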
A14.3.4 HEADER AND DATA FIELD SOURCE
A14.3.4.1 FIELD SOURCE WHEN GENERATING PACKETS
The following tables provide an indication of the source of the various
header and data fields in the data packets for the various IBA services.
The following terms are used in the table:
Link This indicates the value is attached to the packet based on either a fixed value, or values dependent on the service, or values looked up based on parameters loaded into the logical port. Done by the link layer.
Tr This indicates that the value is fixed or calculated by the transport layer.
XRC QP This indicates that the value is derived from the XRC INI or XRC TGT QP context
XRC SRQ This indicates that the value is checked against values from the XRC SRQ context
NA Not Applicable
WQE The value is directly or indirectly (via Address vector) derived from information in the WQE
Table 5 Packet Fields and Parameters by Service
Parameter | Description | XRC
LRH VL | The VL to use for requests. Based on SL and the port SL to VL mapping table. | link
LRH LVer | The version of the link level. This field depends on the revision of the device. | link
LRH SL | The SL to use for requests and responses | XRC QP
LRH LNH IBA | IBA transport bit, indicates that BTH follows | 1
LRH LNH GRH | GRH bit, indicates that a GRH follows | XRC QP
LRH DLID | Destination local ID used for routing | XRC QP
LRH Packet Length | Length of the local packet; calculated by the transport based on the message length. | WQE
LRH SLID (high bits not covered by LMC) | Source local ID in outgoing packets. From the port. With LMC low order bits (0s) added, the value is called “Base LID”. | link
LRH SLID (low bits covered by the LMC) | Source logical ID in outgoing packets. These LMC (as a number) bits are called the “path” bits. | XRC QP
GRH IPVer | CA’s set to 6 | Tr
GRH Tclass | CA’s set to 0; it will then be loaded with another value at the first encountered router. Alternately set according to application. | XRC QP
Table 5 Packet Fields and Parameters by Service (Continued)
Parameter | Description | XRC
GRH FlowLabel | CA’s set to 0; it will then be loaded with another value at the first encountered router. Alternately set according to application. | XRC QP
GRH Paylen | Length of the global packet; calculated by the transport based on the message length. | WQE
GRH NxtHdr | CA’s set to IBA (0x1B) | Tr
GRH HopLmt | CA’s set to 0; it will then be loaded with another value at the first encountered router. Alternately set according to application. | XRC QP
GRH SGID | Source Global ID, from the port table and the index found in: | XRC QP
GRH DGID | Destination Global ID | XRC QP
BTH OpCode | Depends on operation, set by the transport layer. | Tr
BTH TVer | The version of the transport. This field depends on the revision of the device (0). | Tr
BTH P_Key | Partition Key, from the port table and the Index found in: | XRC QP
BTH DestQP | Destination QP *For RD mode responses, this is from the Request Packet Source QP as stored in the EEC | XRC QP
BTH Pad | Length of packet pad; used to calculate actual data size. Calculated by the transport layer based on data size. | WQE
BTH SE | Solicited Event | WQE
BTH M | Migrate. Set by the transport dependent on the migration state. | Tr
BTH AckReq | Acknowledge request | Tr
BTH PSN | Packet Sequence Number | XRC QP
RDETH EEC | Destination EE Context | NA
DETH Q_Key | Key which protects datagram QPs | NA
DETH Source QP | Source QP. Set by transport for datagram services. | NA
RETH | All fields of the RDMA Extended Transport Header (when used) are taken from the WQE | WQE
XRCETH XRCSRQ | Destination XRC SRQ | WQE
AtomicETH | All fields of the ATOMIC Extended Transport Header (when used) are taken from the WQE | WQE
Table 5 Packet Fields and Parameters by Service (Continued)
Parameter | Description | XRC
AETH MSN | Message Sequence number (ACKs only) | XRC QP
AETH Syndrome | Acknowledge syndrome, computed based on operation for reliable services | XRC QP + XRC SRQ
AETH RNR-NAK timer (TTTTT) | This value is placed in the AETH.TTTTT field when sending an RNR NAK. It denotes the minimum time to wait before retrying the request. | XRC QP
AETH credit count (CCCCC) | This value is placed in the AETH.CCCCC field when sending an Ack in RC mode. It denotes the number of receive WQEs available to receive Send or RDMA write with immediate messages. | NA (a)
AtomicAckETH | ATOMIC data returned; the data is loaded as defined by the R_Key and Virtual Address, stored per WQE | WQE
IETH R_Key | This is the R_KEY that the responder is being asked to invalidate in a SEND with Invalidate operation. | WQE
Immediate data | Dependent on operation | WQE
Payload | Dependent on operation | WQE
ICRC | Calculated by transport; data dependent | link
VCRC | Calculated by Link layer; data dependent | link
a. XRC responses carry invalid e2e credits
A14.3.4.2 TRANSPORT CONNECTION PARAMETERS
The following are not sent “on the wire” but are needed to implement the
protocol. This table is included to provide a better understanding of the
parameters used by the transport layer to provide a connection. This list only
covers elements mentioned in the IBA specification; other elements may
be needed to completely implement connections.
Table 6 Connection Parameters by Transport Service
Parameter | Description | XRC
Connect state | State of connection (Reset, RTR, RTS, Error etc.) | XRC QP + XRC SRQ
Port number | Used only if there is more than a single port | XRC QP
Global/Local header | Determines if global header is to be attached or not. | XRC QP
MTU | Max Size of the packets allowed on this connection. | XRC QP
RNR NAK retry time | Time before performing a retry due to RNR; this is initialized by the AETH.TTTTT field in the RNR-NAK, and counts down from there. | XRC QP
RNR Retry init | Send Queue RNR retry count Initialization value | XRC QP
RNR Retry counter | Send Queue RNR Retry counter value | XRC QP
Local ACK Timeout | The exponent used to calculate the delay before an ACK is declared “lost”. | XRC QP
Error Retries | Send Queue retry count for sequence or time-out errors | XRC QP
MigState | Migration State (Migrated, Armed, ReArm) | XRC QP
Disable_E2E_Credits | Send queue use E2E protocol (depends on remote side’s ability to send credits) | NA (a)
Path Speed (IPD) | Controls packet emission for slower links | XRC QP
PD | Protection Domain for this QP | XRC SRQ
XRC Domain | XRC Domain | XRC QP + XRC SRQ
XmitPSN | Sequence number used when sending | XRC QP
AckPSN | Sequence number expected for the ACKs | XRC QP
Rx ePSN | Sequence number expected when receiving | XRC QP
RxAckPSN | Number of unacknowledged Rx packets | XRC QP
SSN | Transmit messages Sent Sequence Number | XRC QP
Table 6 Connection Parameters by Transport Service (Continued)
Parameter | Description | XRC
Rx MSN | Message Sequence Number | XRC QP
Rx credits | Rx queue elements posted | XRC SRQ
LSN | Limit Sequence number (credit accounting) | XRC QP
SchQP_dequeue | QP at head of schedule queue (RD mode) | NA
SchQP_enqueue | QP at tail of schedule queue (RD mode) | NA
SchQP_Next | Pointer to next QP to be scheduled (RD mode) | NA
Num_RDMA_Reads | Number of RDMA READs or ATOMICs supported by remote side | XRC QP
RDMAR/VA/R_Key/Size or ATOMIC “result” | The “hidden” stored address(s) of RDMA READ request(s) or ATOMIC results | XRC QP
RDMA PSN# or ATOMIC PSN # | Sequence number of requested op, used to match response on a repeat, and store reply PSN | XRC QP
RDMAR/ATOMIC Use | Usage of the resource; 1=RDMAR, 0=ATOMIC | XRC QP
Rx Completion Q | | XRC SRQ
Tx Completion Q | | XRC INI QP
Tx WQE pointer | Points to current Send WQE and its data segments for requests | XRC QP
Tx ACK WQE pointer | Points to current Send WQE and its data segments for Completions | XRC QP
Rx WQE pointers | Points to current Receive descriptor | XRC SRQ
a. e2e credits are always invalid for XRC responses
A14.3.4.3 PACKET HEADER AND DATA FIELD VALIDATION
The following tables provide an indication of the validation responsibility
of the various header and data fields in the data packets for the various
IBA services. The following terms are used in the table:
Link This indicates the value is checked by the link layer.
Tr This indicates that the value is checked against fixed values or used by the transport layer to select among choices.
XRC QP This indicates that the value is checked against values from the XRC INI or XRC TGT QP context
XRC SRQ This indicates that the value is checked against values from the XRC SRQ context
NC The value is Not checked
NA Not Applicable
WQE The value is checked against information derived from the WQE
Table 7 Packet Fields Validation source by Service
Parameter | Description | XRC
LRH VL | The VL on incoming packet. | link
LRH LVer | The version of the link level. This field depends on the revision of the device. | link
LRH SL | The SL to use for requests | NC
LRH LNH IBA | IBA transport bit, indicates that BTH follows | Tr(1)
LRH LNH GRH | GRH bit, indicates that a GRH follows | XRC QP
LRH DLID | Destination local ID used for routing. This is always checked at the link layer against Base LID and LMC. | link, XRC QP
LRH Packet Length | Length of the local packet; checked against MTUCap and NeighborMTU at link, valid packet size at Transport, and data buffer size and protection values. | WQE
LRH SLID | Source local ID in ongoing packets. | XRC QP
GRH IPVer | Checked for the value ’6’ | Tr
GRH Tclass | Traffic Class | NC
GRH FlowLabel | Flow label | NC
GRH Paylen | Length of the global packet; may be checked against PMTU and LRH Packet Length at link, valid packet size at Transport, and data buffer size and protection values. | WQE
GRH NxtHdr | Checked for the value 0x1B | Tr
GRH HopLmt | Hop Limit | NC
GRH SGID | Source Global ID | XRC QP
GRH DGID | Destination Global ID | XRC QP
BTH OpCode | Depends on operation | Tr
BTH TVer | The version of the transport. | Tr
BTH P_Key | Partition Key; checked against the port partition table (a) | XRC QP
BTH DestQP | Destination QP; checked against the valid set and QP mode by transport. | Tr
BTH Pad | Length of packet pad; supplements LRH Packet Length. | WQE
BTH SE | Solicited Event; passed to upper layers for each message | Tr
Table 7 Packet Fields Validation source by Service (Continued)
Parameter | Description | XRC
BTH M | Migrate. Checked and used by transport to select alternate path parameters | Tr
BTH AckReq | Acknowledge request | Tr
BTH PSN | Packet Sequence Number | XRC QP
RDETH EEC | Destination EE Context; checked against the valid set and EE mode by transport. | Tr
DETH Q_Key | Key which protects datagram QPs | NA
DETH Source QP | Source QP. Passed to upper layers for each message. | NC
RETH | All fields of the RDMA Extended Transport Header (when used) are validated against protection parameters associated with QP state. | XRC SRQ
AtomicETH | All fields of the ATOMIC Extended Transport Header (when used) are validated against protection parameters associated with QP state. | XRC SRQ
AETH MSN | Message Sequence number (ACKs only) | Tr
AETH Syndrome | Acknowledge syndrome | Tr
AtomicAckETH | Atomic data returned; Passed to upper layers for each message. | NC
IETH R_Key | This is the R_KEY that the responder is being asked to invalidate in a SEND with Invalidate operation. | (b)
XRCETH XRCSRQ | This is the XRC SRQ that is being targeted by the incoming packet in question | XRC SRQ
Immediate data | Dependent on operation; Passed to upper layers for each message. | NC
Payload | Dependent on operation; Passed to upper layers for each message. | NC
ICRC | Checked by transport | link
VCRC | Checked by Link layer; data dependent | link
a. For QP1, the P_Key need only be a member of the port’s Partition table, it is not checked against a QP index.
b. See details in 9.4.1.1.3 R_Key Validation for Remote Memory Invalidate on page 254
A14.4 XRC SOFTWARE TRANSPORT INTERFACE
The XRC transport service offers semantics that are similar to those of RC
on the requester side and RD on the responder side. To that end new
transport service objects are defined in this annex that closely resemble
aspects of the existing ones (RC and RD) namely:
• XRC INI QP
• XRC TGT QP
• XRC SRQ
• XRC Domain
Specifically, we use an XRC INI QP on the requester side connected to an
XRC TGT QP on the responder side. XRC INI and XRC TGT QPs are allocated
from the overall number of QPs that the HCA supports.
Each XRC message can target one XRC SRQ on the responder side, which is
the actual RQ where receive WRs are posted. The choice of XRC SRQ is on
a per-WR basis, and the selected XRC SRQ number is a new input modifier
to the PostSend verb for XRC INI QPs. The selected XRC SRQ number is
carried on the XRC packets using a new extended transport header as
defined in Section 14.3.2.1 on page 11. XRC SRQs are allocated from a
dedicated set as specified in Section 14.5.2.1.1 Query HCA.
As implied by their names, XRC INI QPs do not have a responder side and
XRC TGT QPs do not have a requester side. XRC SQ WRs (requests) are
posted to XRC INI QPs while XRC RQ WRs are posted to XRC SRQs
(and accessed through XRC TGT QPs).
XRC SRQs and XRC TGT QPs are associated by means of an XRC Domain, which
is identical in concept to the RDD of the RD transport service. An XRC
packet that is received through an XRC TGT QP will be successfully
processed if and only if the XRC Domain for the XRC TGT QP is the same
as that of the XRC SRQ that the packet is targeting.
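The XRC Domain check on the responder side can be sketched as a lookup followed by a comparison. The Python below is a simplified model under stated assumptions: the class names, the `select_srq` function, and the use of exceptions to stand in for the "Invalid XRCETH" and "XRC Domain Violation" errors of Table 3 are illustrative choices, and the XRC SRQ state check is omitted.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class XrcSrq:
    number: int
    xrc_domain: int
    pd: int  # protection domain used for incoming requests


@dataclass
class XrcTgtQp:
    number: int
    xrc_domain: int


def select_srq(qp: XrcTgtQp, srq_number: int, srqs: Dict[int, XrcSrq]) -> XrcSrq:
    """Return the XRC SRQ targeted by a packet, or raise if it must be rejected."""
    srq = srqs.get(srq_number)
    if srq is None:
        # Corresponds to the "Invalid XRCETH" row: the XRC SRQ does not exist.
        raise LookupError("Invalid XRCETH: XRC SRQ does not exist")
    if srq.xrc_domain != qp.xrc_domain:
        # Corresponds to the "XRC Domain Violation" row.
        raise PermissionError("XRC Domain Violation")
    return srq
```

A packet arriving on an XRC TGT QP in domain 1 that names an XRC SRQ in domain 2 is rejected here, while one naming an XRC SRQ in domain 1 is accepted and then uses that XRC SRQ's PD.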
Since the responder CQ is the one defined for the XRC SRQ that is being
targeted, there are no ordering guarantees for messages coming through
a single XRC TGT QP that go to different XRC SRQs.
XRC SRQs do not support type 2 memory windows.
XRC INI and XRC TGT QPs follow the same QP states as RC QPs.
It is worth noting that even though XRC TGT QPs never generate requests,
they still go to the RTS state for the purposes of path migration.
Error semantics follow a model similar to that of RC QPs, with a few
exceptions as described in the error section of this Annex. Upon errors,
XRC TGT QPs generate asynchronous events. In addition, when relevant,
XRC receive WQEs are flushed in error and those errors are affiliated with
the SRQ to which the WQEs were originally posted.
PDs follow a similar model to that of RD QPs. Incoming XRC requests
always use the PD of the XRC SRQ indicated in the XRC request (including
for RDMAs where no receive WQE from the XRC SRQ is consumed).
A14.5 XRC SOFTWARE TRANSPORT VERBS
A14.5.1 VERBS OVERVIEW
Table 8 Verb Classes
Verb | Mandatory/Optional Classification | Consumer Accessibility
Allocate XRC Domain | XRC transport service | Privileged
Deallocate XRC Domain | XRC transport service | Privileged
Create XRC Shared Receive Queue | XRC transport service | Privileged
Query XRC Shared Receive Queue | XRC transport service | Privileged
Modify XRC Shared Receive Queue | XRC transport service | Privileged
Destroy XRC Shared Receive Queue | XRC transport service | Privileged
Create XRC Target Queue Pair | XRC transport service | Privileged
Query XRC Target Queue Pair | XRC transport service | Privileged
Modify XRC Target Queue Pair | XRC transport service | Privileged
Destroy XRC Target Queue Pair | XRC transport service | Privileged
A14.5.2 TRANSPORT RESOURCE MANAGEMENT
A14.5.2.1 HCA
A14.5.2.1.1 QUERY HCA
New output modifiers are added to the QUERY HCA verb to indicate support
for the XRC Transport Service as follows:
Output Modifiers:
• Maximum number of XRC domains supported by this HCA. Shall be zero if the HCA does not support XRC.
• If HCA supports the XRC Shared Receive Queues:
• Maximum number of XRC SRQs.
• Maximum number of WRs per XRC SRQ.
• Maximum number of Scatter/Gather entries per XRC SRQ WR.
• Ability to modify the maximum number of WRs per XRC SRQ.
A14.5.2.1.2 ALLOCATE XRC DOMAIN
Description:
Allocates an unused XRC Domain object. XRC Domain objects are required
when setting up an XRC TGT Queue Pair and XRC SRQs. An XRC Domain object
provides an association between XRC TGT Queue Pairs and XRC SRQs.
Operations on an XRC TGT QP directed at an XRC SRQ are allowed only when
the XRC Domain of the XRC TGT QP and the XRC Domain of the XRC SRQ are
identical.
Input Modifiers:
• HCA Handle.
Output Modifiers:
• XRC domain object.
• Verb Results:
• Operation completed successfully.
• Insufficient resources to complete request.
• Invalid HCA handle.
• XRC not supported.
A14.5.2.1.3 DEALLOCATE XRC DOMAIN
Description:
Returns a previously allocated XRC domain object for reuse by the
Allocate XRC Domain Verb. The XRC domain object cannot be deallocated
if it is still associated with a Queue Pair or a SRQ.
Input Modifiers:
• HCA Handle.
• XRC domain object.
Output Modifiers:
• Verb Results:
• Operation completed successfully.
• Invalid XRC Domain.
• XRC domain is in use.
• Invalid HCA handle
• XRC not supported.
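The Allocate and Deallocate XRC Domain verb results can be illustrated with a small reference-counting model. This is a sketch only: `XrcDomainTable`, `VerbError`, and the `associate`/`dissociate` helpers are names invented for this example (they stand in for creating and destroying XRC TGT QPs and XRC SRQs), and a real CI would track these associations internally.

```python
class VerbError(Exception):
    """Stands in for a non-success verb result."""


class XrcDomainTable:
    def __init__(self, max_domains: int):
        self.max_domains = max_domains  # zero models an HCA without XRC support
        self._users = {}                # domain id -> number of associated QPs/SRQs
        self._next_id = 1

    def allocate(self) -> int:
        if self.max_domains == 0:
            raise VerbError("XRC not supported")
        if len(self._users) >= self.max_domains:
            raise VerbError("Insufficient resources to complete request")
        domain = self._next_id
        self._next_id += 1
        self._users[domain] = 0
        return domain

    def associate(self, domain: int) -> None:
        """Record that a QP or SRQ now uses the domain."""
        if domain not in self._users:
            raise VerbError("Invalid XRC Domain")
        self._users[domain] += 1

    def dissociate(self, domain: int) -> None:
        """Record that a QP or SRQ using the domain was destroyed."""
        if domain not in self._users:
            raise VerbError("Invalid XRC Domain")
        self._users[domain] = max(0, self._users[domain] - 1)

    def deallocate(self, domain: int) -> None:
        if self.max_domains == 0:
            raise VerbError("XRC not supported")
        if domain not in self._users:
            raise VerbError("Invalid XRC Domain")
        if self._users[domain] > 0:
            raise VerbError("XRC domain is in use")
        del self._users[domain]
```

In this model, deallocating a domain while a QP or SRQ is still associated with it fails with "XRC domain is in use"; once the last user is gone, deallocation succeeds and the slot becomes available to a later allocate call.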
A14.5.2.2 XRC SRQS
A14.5.2.2.1 CREATE XRC SHARED RECEIVE QUEUE
Description:
Creates an XRC SRQ for the specified HCA.
A set of initial XRC SRQ attributes must be specified by the Consumer.
o0-0.2.1: If the CI supports XRC SRQ and any of the required initial
attributes are illegal or missing, an error shall be returned and the
XRC SRQ shall not be created.
On success, a handle to the newly created XRC SRQ is returned.
Input Modifiers:
• HCA handle.
• XRC Domain
• Protection Domain. As opposed to regular SRQs, where the QP PD is
used for memory access validation, for XRC SRQs the XRC SRQ PD is
the one used for that purpose.
• CQ Handle. As opposed to regular SRQs, where completions are
reported to the CQ associated with the QP, completions for XRC
SRQs are reported to the SRQ CQ.
• The maximum number of outstanding Work Requests the Consumer
expects to submit to the XRC Shared Receive Queue.
• The maximum number of scatter elements the Consumer will
specify in a Work Request submitted to the XRC Shared Receive
Queue.
• Enable or disable Reserved L_Key operations.
Output Modifiers:
• The XRC SRQ handle for the newly created XRC SRQ.
• XRC SRQ number
• The actual number of outstanding Work Requests supported on
the XRC Shared Receive Queue. If an error is not returned, this is
guaranteed to be greater than or equal to the number requested.
(This may require the Consumer to increase the size of the CQ.)
• The actual number of scatter elements that can be specified in
Work Requests submitted to the XRC Shared Receive Queue. If
an error is not returned, this is guaranteed to be greater than or
equal to the number requested.
• Verb Results:
• Operation completed successfully.
• Insufficient resources to complete request.
• Invalid HCA handle.
• Invalid CQ handle.
• Maximum number of Work Requests requested exceeds HCA
capability.
• Maximum number of scatter elements requested exceeds
HCA capability.
• Invalid XRC Domain
• Invalid Protection Domain.
• HCA does not support XRC SRQ.
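The sizing guarantee of Create XRC Shared Receive Queue (the actual values are greater than or equal to the requested ones, unless an error is returned) can be sketched as follows. The rounding to a power of two is a hypothetical implementation policy chosen only to show that the actual size may exceed the request; `HcaCaps`, `VerbError`, and the function names are likewise assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


class VerbError(Exception):
    """Stands in for a non-success verb result."""


@dataclass(frozen=True)
class HcaCaps:
    max_xrc_srq_wr: int   # maximum number of WRs per XRC SRQ
    max_xrc_srq_sge: int  # maximum scatter/gather entries per XRC SRQ WR


def round_up_pow2(n: int) -> int:
    """Hypothetical sizing policy: round up to the next power of two."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def create_xrc_srq_sizes(caps: HcaCaps, max_wr: int, max_sge: int) -> Tuple[int, int]:
    """Return (actual_wr, actual_sge), each >= the requested value."""
    if max_wr > caps.max_xrc_srq_wr:
        raise VerbError(
            "Maximum number of Work Requests requested exceeds HCA capability")
    if max_sge > caps.max_xrc_srq_sge:
        raise VerbError(
            "Maximum number of scatter elements requested exceeds HCA capability")
    # Clamp to the HCA limit; since max_wr <= limit, the result is still >= max_wr.
    actual_wr = min(round_up_pow2(max_wr), caps.max_xrc_srq_wr)
    return actual_wr, max_sge
```

Because the actual number of Work Requests can be larger than requested, the Consumer may need to size the associated CQ from the returned value rather than from the requested one.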
A14.5.2.2.2 QUERY XRC SHARED RECEIVE QUEUE
Description:
Returns the attributes of the specified XRC SRQ.
Input Modifiers:
• HCA handle.
• XRC SRQ Handle.
Output Modifiers:
• XRC SRQ number
• XRC Domain
• Protection Domain.
• CQ Handle
• The actual number of outstanding Work Requests supported on
the XRC Shared Receive Queue.
• The actual number of scatter elements that can be specified in
Work Requests submitted to the XRC Shared Receive Queue.
• XRC SRQ Limit. If the XRC SRQ Limit is armed, returns the current
XRC SRQ Limit. If the XRC SRQ Limit is not armed, returns zero.
• Verb Results:
• Operation completed successfully.
• Invalid HCA handle.
• Invalid XRC SRQ handle.
• XRC SRQ is in the Error State.
• HCA does not support XRC SRQ.
A14.5.2.2.3 MODIFY XRC SHARED RECEIVE QUEUE
Description:
Modifies the attributes of an XRC SRQ for the specified HCA.
If any of the modify attributes are invalid, none of the attributes shall
be modified.
Input Modifiers:
• HCA handle.
• XRC SRQ handle.
• The XRC SRQ attributes to modify and their new values. The
XRC SRQ attributes that can be modified after the XRC SRQ has
been created are:
• The maximum number of outstanding Work Requests the
Consumer expects to submit to the XRC Shared Receive
Queue, if resizing of the XRC SRQ is supported by the HCA.
• XRC SRQ Limit. If the XRC SRQ Limit is greater than zero,
then it shall be armed upon returning from this verb.
Output Modifiers:
• The actual number of outstanding Work Requests supported on
the XRC Shared Receive Queue. If an error is not returned, this is
guaranteed to be greater than or equal to the number requested.
(This may require the Consumer to increase the size of the CQ.)
• Verb Results:
• Operation completed successfully.
• Insufficient resources to complete request.
• Invalid HCA handle.
• Invalid XRC SRQ handle.
• XRC SRQ is in the Error State.
• HCA does not support resizing XRC SRQ.
• Maximum number of Work Requests requested exceeds HCA
capability.
• XRC SRQ Limit exceeds maximum number of Work Requests
allowed on the XRC SRQ.
• More outstanding entries on WQ than size specified.
• HCA does not support XRC SRQ.
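The all-or-nothing rule and the limit-arming behavior described above can be sketched as follows. This is a minimal Python model, not a verbs API: the class `XrcSrqModel`, its field names, the default HCA capability values, and the treatment of a zero limit as "not armed" are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class XrcSrqModel:
    """Hypothetical in-memory model of an XRC SRQ (illustration only)."""
    max_wr: int                  # current queue size
    outstanding: int = 0         # WRs currently posted
    limit: int = 0               # assumption: 0 means the limit is not armed
    in_error: bool = False
    hca_max_wr: int = 1024       # assumed HCA capability
    hca_can_resize: bool = True  # assumed HCA capability

    def modify(self, new_max_wr: Optional[int] = None,
               new_limit: Optional[int] = None) -> str:
        """Validate every requested attribute first, then apply all or none."""
        if self.in_error:
            return "XRC SRQ is in the Error State"
        size = self.max_wr if new_max_wr is None else new_max_wr
        if new_max_wr is not None:
            if not self.hca_can_resize:
                return "HCA does not support resizing XRC SRQ"
            if new_max_wr > self.hca_max_wr:
                return "Maximum number of Work Requests requested exceeds HCA capability"
            if self.outstanding > new_max_wr:
                return "More outstanding entries on WQ than size specified"
        if new_limit is not None and new_limit > size:
            return "XRC SRQ Limit exceeds maximum number of Work Requests allowed"
        # All checks passed: commit both attributes together.
        self.max_wr = size
        if new_limit is not None:
            self.limit = new_limit   # a value greater than zero arms the limit
        return "Operation completed successfully"

    def query_limit(self) -> int:
        """Return the armed limit, or zero when the limit is not armed."""
        return self.limit
```

Because validation precedes any assignment, a request that fails on one attribute leaves the other attribute untouched, which is the behavior the verb description requires.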
A14.5.2.2.4 DESTROY XRC SHARED RECEIVE QUEUE
Description:
Destroys an XRC SRQ for the specified HCA.
Input Modifiers:
• HCA handle.
• XRC SRQ handle.
Output Modifiers:
• Verb Results:
• Operation completed successfully.
• Invalid HCA handle.
• Invalid XRC SRQ handle.
• HCA does not support XRC SRQ.
A14.5.2.3 XRC TARGET QP
A14.5.2.3.1 CREATE XRC TARGET QP
Description:
Creates an XRC Target QP for the specified HCA.
A set of initial QP attributes must be specified by the Consumer.
On success, a handle to the newly created XRC Target QP and the XRC QP
number are returned.
Input Modifiers:
• HCA handle.
• XRC Domain to be associated with this XRC TGT QP.
Output Modifiers:
• XRC TGT QP handle.
• XRC QP number.
• Verb Results:
• Operation completed successfully.
• Insufficient resources to complete request.
• Invalid HCA handle.
• Invalid XRC Domain.
• HCA does not support XRC.
A14.5.2.3.2 MODIFY XRC TARGET QUEUE PAIR
Description:
Only a subset of the XRC Target QP attributes can be modified in each
of the QP states.
Table 9 XRC Target QP State Transition Properties

Transition: Reset to Init
  Required Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • P_Key index.
    • Primary Physical port.
  Optional Attributes: None.

Transition: Init to Init
  Required Attributes: None.
  Optional Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • P_Key index.
    • Primary Physical port.
  Actions: No transition.

Transition: Init to RTR
  Required Attributes:
    • Remote Node Address Vector.
    • Loopback Indicator. [d]
    • Destination QP Number.
    • RQ PSN.
    • Number of responder resources for RDMA Read/atomic ops.
    • Minimum RNR NAK Timer Field.
  Optional Attributes:
    • Alternate path address information.
    • Enable/disable RDMA and Atomic Operations. [a]
    • P_Key index.
  Actions: Activate receive processing.

Transition: RTR to RTS
  Required Attributes:
    • Local ACK Timeout.
    • SQ PSN.
  Optional Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • Alternate path address information.
    • Path migration state.
    • Current QP State.
    • Minimum RNR NAK Timer Field.
  Actions: Activate send processing.

Transition: RTS to RTS (no transition)
  Required Attributes: None.
  Optional Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • Alternate path address information.
    • Path migration state.
    • Current QP State.
    • Minimum RNR NAK Timer Field.
  Actions: No transition.

Transition: RTS to SQD
  Required Attributes: None.
  Optional Attributes: None.
  Actions: Deactivate send processing.

Transition: SQD to SQD
  Required Attributes: None.
  Optional Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • Remote Node Address Vector. [b] [c]
    • Loopback Indicator. [d]
    • Alternate path address information.
    • Path migration state.
    • Number of local RDMA Read/atomic responder resources. [b]
    • P_Key index. [b]
    • Local ACK Timeout. [b]
    • Primary physical port associated with QP if HCA supports the
      capability to change the primary physical port for a QP when
      transitioning from SQD to SQD state. [b]
    • Minimum RNR NAK Timer Field.
  Actions: Modify QP attributes.

Transition: SQD to RTS
  Required Attributes: None.
  Optional Attributes:
    • Enable/disable RDMA and Atomic Operations. [a]
    • Alternate path address information.
    • Path migration state.
    • Current QP State.
    • Minimum RNR NAK Timer Field.
  Actions: Activate send processing.

Transition: Any State to Error
  Required Attributes: None.
  Optional Attributes: None allowed.
  Actions: Queue processing is stopped. Work Requests pending or in
  process are completed in error, when possible.

Transition: Any State to Reset
  Required Attributes: None.
  Optional Attributes: None allowed.
  Actions: QP attributes are reset to the same values after the QP was
  created.

a. If disable RDMA is requested while incoming RDMAs to that queue are in process, it is indeterminate when
the disable will take effect. It is up to the Consumer to coordinate the disable with the remote QPs.
b. It is allowed to change this attribute only when the SQ is drained.
Input Modifiers:
• same as for RC QP except for:
• maximum number of outstanding WRs in the Send Queue
• maximum number of outstanding WRs in the Receive Queue
• Initiator Depth
• Retry Count
• RNR Retry Count
Output Modifiers:
• same as for RC QP (SQ/RQ related attributes are N/A).
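The transition rules of Table 9 lend themselves to a lookup keyed by the (current state, next state) pair. The sketch below is a hedged illustration in Python, not part of the verbs interface: the short attribute names and the function `check_modify` are hypothetical, the required/optional split follows one reading of the table as extracted, and the Loopback Indicator is treated as merely permitted because footnote d makes it HCA-dependent.

```python
# Short attribute names are hypothetical labels for the Table 9 entries.
ACCESS, PKEY, PORT = "rdma_atomic_enable", "pkey_index", "primary_port"
AV, LOOPBACK, DQPN = "remote_av", "loopback", "dest_qpn"
RQ_PSN, SQ_PSN = "rq_psn", "sq_psn"
RESP_RES, RNR_TIMER = "responder_resources", "min_rnr_nak_timer"
ALT_PATH, MIG_STATE = "alt_path", "path_mig_state"
CUR_STATE, ACK_TIMEOUT = "cur_qp_state", "local_ack_timeout"

# Optional set shared by RTR->RTS, RTS->RTS and SQD->RTS.
SEND_SIDE_OPT = {ACCESS, ALT_PATH, MIG_STATE, CUR_STATE, RNR_TIMER}

# (current state, next state) -> (required attributes, optional attributes)
TABLE_9 = {
    ("RESET", "INIT"): ({ACCESS, PKEY, PORT}, set()),
    ("INIT", "INIT"): (set(), {ACCESS, PKEY, PORT}),
    # Assumption: Loopback Indicator is only permitted, not enforced.
    ("INIT", "RTR"): ({AV, DQPN, RQ_PSN, RESP_RES, RNR_TIMER},
                      {LOOPBACK, ALT_PATH, ACCESS, PKEY}),
    ("RTR", "RTS"): ({ACK_TIMEOUT, SQ_PSN}, SEND_SIDE_OPT),
    ("RTS", "RTS"): (set(), SEND_SIDE_OPT),
    ("RTS", "SQD"): (set(), set()),
    ("SQD", "SQD"): (set(), {ACCESS, AV, LOOPBACK, ALT_PATH, MIG_STATE,
                             RESP_RES, PKEY, ACK_TIMEOUT, PORT, RNR_TIMER}),
    ("SQD", "RTS"): (set(), SEND_SIDE_OPT),
}


def check_modify(cur, new, attrs):
    """Return None if the request matches Table 9, else a reason string."""
    attrs = set(attrs)
    if new in ("ERROR", "RESET"):
        # Any state may move to Error or Reset, but no attributes are allowed.
        return None if not attrs else "no attributes allowed"
    entry = TABLE_9.get((cur, new))
    if entry is None:
        return "transition not listed"
    required, optional = entry
    missing = required - attrs
    if missing:
        return "missing required: " + ", ".join(sorted(missing))
    extra = attrs - required - optional
    if extra:
        return "not modifiable here: " + ", ".join(sorted(extra))
    return None
```

A caller would run `check_modify` before touching any QP state, so that a rejected request leaves the QP unchanged.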
A14.5.2.3.3 QUERY XRC TARGET QUEUE PAIR
Description:
Returns the attribute list and current values for the specified QP. This
QP handle can be any QP handle supplied by the Verbs.
Input Modifiers:
• HCA handle.
• QP handle.
Output Modifiers:
• same as for RC QP except for:
• Actual number of outstanding requests supported on the
Send Queue.
• Actual number of outstanding requests supported on the
Receive Queue.
• Initiator Depth
• Retry Count
• RNR Retry Count
• and with the addition of:
• XRC Domain
• Verb Results:
• Operation completed successfully.
• Invalid HCA handle.
• Invalid QP handle.
Table 9 footnotes (continued):
c. When changing the Remote Node Address Vector, the Path MTU cannot be changed in the SQD to SQD
transition.
d. Supported only if indicated in Query HCA. Destination LID and Loopback Indicator are mutually exclusive.
A14.5.2.3.4 DESTROY XRC TARGET QUEUE PAIR
Description:
Destroys the specified QP.
C0-1: Incoming operations destined for a QP that has been destroyed
shall be discarded.
Input Modifiers:
• HCA handle.
• QP handle.
Output Modifiers:
• Verb Results:
• Operation completed successfully.
• Invalid HCA handle.
• Invalid QP handle.
A14.5.2.4 XRC INITIATOR QP
Description:
Create/Modify/Query/Destroy as for RC QP (Transport Service type is set
to XRC for Create QP).
The following input modifiers are ignored for XRC Initiator QPs:
• SRQ Handle
• Receive Queue CQ
• The maximum number of outstanding Work Requests the
Consumer expects to submit to the Receive Queue
• The maximum number of scatter/gather elements the Consumer
will specify in a Work Request submitted to the Receive Queue
• Enable or Disable incoming RDMA-R on this QP.
• Enable or Disable incoming RDMA-W on this QP.
• Enable or Disable incoming Atomic Ops on this QP.
• Responder Resources
• Minimum RNR NAK Timer Field
The following output modifiers are ignored for XRC Initiator QPs:
• Actual number of outstanding WRs supported on the receive
queue.
• Actual number of scatter/gather elements that can be speci-
fied in WRs submitted to the Receive Queue.
A14.5.3 WORK REQUEST PROCESSING
A14.5.3.1 QUEUE PAIR OPERATIONS
A14.5.3.1.1 POST SEND REQUEST
For XRC Initiator QPs, Post Send Request is the same as for RC QPs. A
new input modifier, remote XRC SRQ, is required for the operations listed
below:
• Send
• Send with Immediate
• Send with Invalidate
• RDMA Read
• RDMA Write
• RDMA Write with Immediate
• Atomics
Post Send Request is not supported by XRC Target QPs.
A14.5.3.1.2 POST RECEIVE REQUEST
For XRC SRQs, Post Receive Request is the same as for SRQs.
Post Receive Request is not supported by XRC Initiator QPs.
Post Receive Request is not supported by XRC Target QPs.
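The posting rules for XRC Initiator QPs, XRC Target QPs, and XRC SRQs can be summarized in a small validation sketch. It is written in Python purely for illustration: the type `SendWR`, the opcode strings, the QP type labels, and both check functions are hypothetical names, and operations not in the list above (for example local operations) are assumed not to need the remote XRC SRQ modifier.

```python
from dataclasses import dataclass
from typing import Optional

# Operations that need the new "remote XRC SRQ" input modifier.
NEEDS_REMOTE_XRC_SRQ = {
    "SEND", "SEND_WITH_IMMEDIATE", "SEND_WITH_INVALIDATE",
    "RDMA_READ", "RDMA_WRITE", "RDMA_WRITE_WITH_IMMEDIATE", "ATOMIC",
}


@dataclass
class SendWR:
    opcode: str
    remote_xrc_srq: Optional[int] = None  # XRC SRQ number at the responder


def check_post_send(qp_type: str, wr: SendWR) -> None:
    """Raise ValueError if the send request breaks the XRC posting rules."""
    if qp_type == "XRC_TGT":
        raise ValueError("Post Send Request is not supported by XRC Target QPs")
    if (qp_type == "XRC_INI" and wr.opcode in NEEDS_REMOTE_XRC_SRQ
            and wr.remote_xrc_srq is None):
        raise ValueError("remote XRC SRQ is required for " + wr.opcode)


def check_post_receive(target: str) -> None:
    """Receive WRs go to an XRC SRQ, never directly to an XRC QP."""
    if target in ("XRC_INI", "XRC_TGT"):
        raise ValueError("Post Receive Request is not supported by " + target)
```

For example, `check_post_send("XRC_INI", SendWR("SEND", remote_xrc_srq=7))` passes, while the same request without `remote_xrc_srq` raises `ValueError`.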
A14.5.3.2 COMPLETION QUEUE OPERATIONS
A14.5.3.2.1 POLL FOR COMPLETION
Completions for Send WRs posted to XRC Initiator QPs are the same as for
WRs posted to regular QPs.
A new “XRC violation error” is returned for requests that caused the
responder to return a “NAK-Invalid RD Request” NAK. This could have been
caused by either a Remote XRC Domain Violation or an XRCETH Violation,
as detailed in the transport section of this annex.
Completions for Receive WRs posted to XRC SRQs are the same as for WRs
posted to regular SRQs, with the addition of “Local XRC TGT QP Number”
to the list of output modifiers.
Note that there are no completions associated with XRC TGT QPs, as no
WRs are ever posted to XRC TGT QPs.
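One use of the “Local XRC TGT QP Number” output modifier is to tell which connection a message arrived on when several XRC TGT QPs feed the same XRC SRQ. The following Python sketch illustrates that idea only; `RecvCompletion` and `group_by_target_qp` are hypothetical names and do not correspond to any verbs data structure.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class RecvCompletion:
    wr_id: int
    status: str
    local_xrc_tgt_qpn: int  # the output modifier added for XRC SRQs


def group_by_target_qp(completions: Iterable[RecvCompletion]) -> Dict[int, List[int]]:
    """Group receive completions by the XRC TGT QP they arrived through."""
    by_qp = defaultdict(list)
    for wc in completions:
        by_qp[wc.local_xrc_tgt_qpn].append(wc.wr_id)
    return dict(by_qp)
```

Since no Work Requests are posted to the XRC TGT QP itself, this field is the only link between a receive completion and the connection that delivered the message.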
A14.5.4 RESULT TYPES
A14.5.4.1 IMMEDIATE RETURN RESULTS
This section contains a list of the possible Verb return results that are
added for XRC support.
• XRC not supported
• Invalid XRC Domain
• XRC Domain is in use
A14.5.4.2 COMPLETION RETURN STATUS
This section describes the new Work Completion status error return
results for XRC:
• XRC Violation Error - returned for requests that caused the responder
to experience an XRC Domain Violation (XRC SRQ is not in the
same domain as the XRC TGT QP).
A14.5.4.3 ASYNCHRONOUS EVENTS
A14.5.4.3.1 AFFILIATED ASYNCHRONOUS ERRORS
This section describes the new Affiliated Asynchronous Errors for XRC TGT QPs:
• XRC Domain Violation - Responder’s Receive Queue detected an
XRC Domain that does not match the XRC Domain of the XRC SRQ.
• Invalid XRCETH - Responder detected that the XRC SRQ does not
exist or is not in the right state, or detected a wire protocol violation.
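The two responder-side checks behind these errors can be sketched as below. This is an illustrative Python model under stated assumptions: the `Srq` type, the `usable` flag standing in for "in the right state", and the function `responder_check` are hypothetical, and wire protocol violations are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Srq:
    xrc_domain: int
    usable: bool = True  # stands in for "in the right state"


def responder_check(tgt_qp_domain: int, srq_table: Dict[int, Srq],
                    xrc_srq_number: int) -> Optional[str]:
    """Return the affiliated asynchronous error name, or None if accepted."""
    srq = srq_table.get(xrc_srq_number)
    if srq is None or not srq.usable:
        return "Invalid XRCETH"
    if srq.xrc_domain != tgt_qp_domain:
        return "XRC Domain Violation"
    return None
```

In either failing case the requester would observe the XRC Violation Error completion status described in A14.5.3.2.1.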
A14.6 COMMUNICATION MANAGEMENT
A14.6.1 EXTENDED RELIABLE CONNECTED SERVICE
Extended Reliable Connected (XRC) is similar to RC, but is essentially
one-way with XRC Initiator QP being the requestor and XRC Target QP
the responder. Within the context of the Communication Management
protocol, the XRC Initiator QP is associated with the Active side and the
XRC Target QP is associated with the Passive side. This mapping allows
XRC to still use message reception (or the RTU) to cause the transition
from "REP Sent" to "Established" on the Passive side.
A14.6.2 COMMUNICATION MANAGEMENT MESSAGES
XRC connection establishment uses the same CM messages as RC
connection establishment except as noted below.
A14.6.2.1 REQ
Use the same fields as for RC connection establishment, except as modified
below.
Modifications to Table 99:
• Responder Resources field, Values column: 0 for XRC
• Transport Service Type field, Description column: See
Section A14.6.3.1 Transport Service Type.
• Retry Count field, Values column: 0 for XRC
• RNR Retry Count field, Values column: 0 for XRC
• SRQ field, Values column: 0 for XRC
• Change (reserved) field at position byte 51, bit 5 to:
• Field: Extended Transport Type
• Description: See Section A14.6.3.2 Extended Transport Type
• Used for Purpose: C
• Byte [Bit] Offset: 51 [5]
• Length, Bits: 3
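Placing the 3-bit Extended Transport Type at byte 51, bit offset 5 can be sketched as a mask-and-merge on that byte. The Python below is illustrative only and rests on one assumption that this annex does not state: bit offset 0 is taken to be the most significant bit of the byte, so offset 5 with length 3 selects the three least significant bits. The function names are hypothetical.

```python
EXT_TYPE_MASK = 0x07  # assumption: bits 5..7 counted from the MSB


def set_extended_transport_type(byte51: int, ext_type: int) -> int:
    """Return byte 51 with the Extended Transport Type field replaced."""
    if not 0 <= byte51 <= 0xFF:
        raise ValueError("byte51 must fit in one byte")
    if not 0 <= ext_type <= 7:
        raise ValueError("Extended Transport Type is a 3-bit field")
    # Keep the other five bits of the byte and overwrite only this field.
    return (byte51 & ~EXT_TYPE_MASK & 0xFF) | ext_type


def get_extended_transport_type(byte51: int) -> int:
    """Extract the Extended Transport Type field from byte 51."""
    return byte51 & EXT_TYPE_MASK
```

The remaining five bits of the byte are preserved, so fields that share byte 51 are not disturbed when the Extended Transport Type is written.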
A14.6.2.2 REP
Use the same fields as for RC connection establishment, except as modified
below.
Modifications to Table 103:
• Initiator Depth field, Values column: 0 for XRC
• End-to-End Flow Control field, Values column: 0 for XRC
• SRQ field, Values column: 1 for XRC
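The fixed field values that REQ and REP carry for XRC can be checked with a simple table comparison. The sketch below is a hedged Python illustration: the dictionary keys are hypothetical names for the Table 99 and Table 103 fields, and `field_violations` is not part of any CM implementation.

```python
# Values listed for XRC in the REQ and REP modifications above.
XRC_REQ_VALUES = {
    "responder_resources": 0,
    "retry_count": 0,
    "rnr_retry_count": 0,
    "srq": 0,
}
XRC_REP_VALUES = {
    "initiator_depth": 0,
    "end_to_end_flow_control": 0,
    "srq": 1,
}


def field_violations(message: dict, expected: dict) -> list:
    """Return the sorted names of fields that differ from the XRC values."""
    return sorted(name for name, value in expected.items()
                  if message.get(name) != value)
```

A missing field counts as a violation here, since `message.get` returns `None` for it; a stricter or more lenient policy is an equally valid choice.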
A14.6.3 MESSAGE FIELD DETAILS
A14.6.3.1 TRANSPORT SERVICE TYPE
This Annex changes the encoding so that a value of 3 is no longer
reserved. Instead, a value of 3 in this field indicates the use of an
"Extended" transport type (see Section A14.6.3.2 Extended Transport Type).
A14.6.3.2 EXTENDED TRANSPORT TYPE
This field is only used when the Transport Service Type field indicates use
of an "Extended" transport type. Otherwise, this field should be ignored.
The Extended Transport Type field specifies the desired service type.
The field is encoded as follows:
• 0: Must be used when the Transport Service Type field is not
equal to 3
• 1: XRC
• 2-7: Reserved
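The interaction between the two fields can be sketched from both sides of the exchange. This Python example is illustrative and the function names are hypothetical: the sender forces the extended field to 0 unless the Transport Service Type is 3, and the receiver ignores the extended field unless the Transport Service Type is 3. Rejecting an extended value of 0 together with a service type of 3 is an assumption drawn from the encoding list, not an explicit rule in the text.

```python
EXTENDED = 3  # Transport Service Type value that selects the extended field
EXT_XRC = 1   # Extended Transport Type value for XRC


def encode_types(service_type: int, ext_type: int = 0) -> tuple:
    """Sender side: return the (service type, extended type) pair for REQ."""
    if not 0 <= service_type <= 3:
        raise ValueError("Transport Service Type is a 2-bit field")
    if service_type != EXTENDED:
        return (service_type, 0)  # extended field must be 0 in this case
    if ext_type != EXT_XRC:
        raise ValueError("only XRC (1) is defined; 2-7 are reserved")
    return (EXTENDED, ext_type)


def is_xrc(service_type: int, ext_type: int) -> bool:
    """Receiver side: the extended field only matters when service type is 3."""
    return service_type == EXTENDED and ext_type == EXT_XRC
```

For instance, `encode_types(3, 1)` yields `(3, 1)`, and `is_xrc(0, 1)` is false because the extended field is ignored for a non-extended service type.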
A14.7 GENERAL SERVICES
A14.7.1 CLASSPORTINFO
Table 317 is modified as follows:
Bit: 14
Name: IsXRCCapable
Meaning: The CM associated with this port supports the establishment
and release of connections between XRC Initiator and Target QPs. It will
accept and respond to all mandatory messages as defined in
Section A14.6.1 Extended Reliable Connected Service.
Bit: 15
Name: Reserved
Meaning: Reserved
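Testing the IsXRCCapable bit reduces to a single shift and mask. The Python sketch below assumes that Table 317 describes a 16-bit capability mask in which bit 0 is the least significant bit; that numbering and the function name are assumptions, not statements from this annex.

```python
IS_XRC_CAPABLE_BIT = 14  # bit position given in the Table 317 modification


def is_xrc_capable(capability_mask: int) -> bool:
    """Return True if the IsXRCCapable bit is set (bit 0 assumed to be LSB)."""
    return bool((capability_mask >> IS_XRC_CAPABLE_BIT) & 1)
```

A CM implementation might consult this bit before sending a REQ with the Extended Transport Type set to XRC.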