
A TCP/IP transport layer for the DAQ of the CMS Experiment

Miklos Kozlovszky
for the CMS TriDAS collaboration
CERN
European Organization for Nuclear Research
ACAT03, December 2003
CMS & Data Acquisition
Collision rate: 40 MHz
Level-1 maximum trigger rate: 100 kHz
Average event size: ~ 1 Mbyte
No. of In-Out units: 1000
Readout network bandwidth: ~ 1 Terabit/s
Event filter computing power: ~ 5 x 10^6 MIPS
Data production: ~ Tbyte/day
[Figure: CMS DAQ block diagram. Labels: Detector Frontend, Level 1 Trigger, Readout Systems, Event Manager, Builder Networks, Filter Systems, Computing Services, Run Control.]
Event builder: Physical system interconnecting data sources with data destinations. It has to move all data fragments of an event to the same destination.
Event fragments: Event data fragments are stored in separate physical memory systems.
Full events: Full event data are stored in one physical memory system associated with a processing unit.
[Figure: NxM EVB connecting 512 data sources (1 MByte events) to ~1000s of HLT processing nodes.]
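The event-building step described above can be illustrated with a small sketch. It assumes fragments arrive as (event_id, source_id, payload) tuples and picks the destination as event_id modulo the number of destinations; both the tuple layout and the assignment rule are assumptions made for illustration, as is the name `build_events`, and none of them is taken from the CMS event builder itself.

```python
from collections import defaultdict


def build_events(fragments, n_sources, n_destinations):
    """Assemble full events from (event_id, source_id, payload) fragments.

    The destination rule (event_id % n_destinations) is an assumption of
    this sketch, not the assignment policy of the real event builder.
    """
    partial = defaultdict(dict)   # event_id -> {source_id: payload}
    built = defaultdict(list)     # destination -> [(event_id, full event bytes)]
    for event_id, source_id, payload in fragments:
        partial[event_id][source_id] = payload
        if len(partial[event_id]) == n_sources:
            # All fragments present: concatenate them in source order.
            parts = partial.pop(event_id)
            full_event = b"".join(parts[s] for s in range(n_sources))
            built[event_id % n_destinations].append((event_id, full_event))
    return dict(built), dict(partial)


# Three sources, two destinations, four events; fragments arrive source by source.
fragments = [(ev, src, f"e{ev}s{src};".encode())
             for src in range(3) for ev in range(4)]
built, incomplete = build_events(fragments, n_sources=3, n_destinations=2)
print(sorted(built), incomplete)
```

Every fragment of a given event ends up in the same destination list, which is the property the slide asks of the event builder; events with missing fragments stay in the second return value.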
Building the events
Distributed DAQ framework developed within CMS.
Construct homogeneous applications for heterogeneous processing clusters.
Multi-threaded (important to take advantage of SMP efficiently).
Zero copy message passing for the event data.
Peer to peer communication between the applications.
I2O for data transport, and SOAP for configuration and control.
Hardware and transport independence.
[Figure: XDAQ layer diagram. Labels: OS and Device Drivers; HTTP; TCP; Ethernet; Myrinet; PC; XDAQ; Util/DDM; Processing; Sensor readout. Callout: subject of presentation.]
DAQ Framework
Reuse old, 'cheap' Ethernet for DAQ
Transport layer requirements:
Reliable communication
Hide the complexity of TCP
Efficient implementation
Simplex communication via sockets
Configurable
Support of blocking and non-blocking I/O
TCP/IP Peer Transport Requirements
Pending Queues (PQ)
Thread safe PQ management
One PQ for each destination
Independent sending through sockets
Only one 'select' function call, both to receive the packets and to send the blocked data.
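The pending-queue scheme listed above can be sketched roughly as follows. This is not the ptATCP code: the class `PendingQueues`, its method names and the 64 KiB receive size are hypothetical, and the sketch assumes a single thread calls `poll` while any thread may call `frame_send`.

```python
import select
import socket
import threading
from collections import deque


class PendingQueues:
    """One pending queue (PQ) of unsent frames per destination socket."""

    def __init__(self):
        self.lock = threading.Lock()   # guards the queues (thread safe PQ management)
        self.queues = {}               # destination socket -> deque of bytes

    def add_destination(self, sock):
        sock.setblocking(False)        # a full kernel buffer must not stall the sender
        with self.lock:
            self.queues[sock] = deque()

    def frame_send(self, sock, frame):
        """Queue a frame; it is written once select reports the socket writable."""
        with self.lock:
            self.queues[sock].append(frame)

    def poll(self, receivers, timeout=0.1):
        """One select call both receives packets and sends the blocked data."""
        with self.lock:
            writers = [s for s, q in self.queues.items() if q]
        readable, writable, _ = select.select(receivers, writers, [], timeout)
        received = [(s, s.recv(65536)) for s in readable]
        for s in writable:
            with self.lock:
                frame = self.queues[s].popleft()
            try:
                sent = s.send(frame)
            except BlockingIOError:
                sent = 0               # kernel buffer filled up again
            if sent < len(frame):
                with self.lock:
                    self.queues[s].appendleft(frame[sent:])  # unsent tail goes first
        return received


if __name__ == "__main__":
    left, right = socket.socketpair()  # stands in for a TCP connection
    pq = PendingQueues()
    pq.add_destination(left)
    pq.frame_send(left, b"fragment")
    data = b""
    while data != b"fragment":
        data += b"".join(chunk for _, chunk in pq.poll([right]))
    print(data)
```

Because each destination has its own queue, a slow peer only delays its own frames; the other sockets keep draining on the next `poll`.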
Implementation of the non-blocking mode
[Figure: non-blocking send path. Labels: XDAQ Application, FrameSend, Pending Queues #2 ... #n (each holding frames 1 2 3 4 5 ... n), Select.]
[Figure: communication through the transport layer. Labels: Applications (XDAQ), XDAQ Executive, XDAQ Framework; Peer Transport Layer, ptATCP, ptATCPPort(s), Sender Object(s), Receiver Object(s), Input SAP(s), Output SAP(s); OS, Driver(s), NIC (10GE), NIC (FE), NIC (GE). Arrow legend: creation of object, sending, receiving, other communication.]
Communication via the transport layer
Throughput optimisation
[Figure: single rail vs. multi-rail connections between App 1 and App 2.]
Operating System tuning (kernel options, buffers)
Jumbo Frames
Transport protocol options
Communication techniques
Blocking vs. Non-Blocking I/O
Single/Multi-rail
Single/Multi-thread
TCP options (e.g. Nagle algorithm)
...
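Several of the per-socket knobs named above map onto standard socket options. The sketch below shows how they might be applied with Python's socket module; the helper name `tune_socket` and the 256 KiB buffer request are illustrative assumptions, not values from the presentation. Jumbo frames are not covered because the MTU is configured on the network interface, not on the socket.

```python
import socket


def tune_socket(sock, buffer_bytes=256 * 1024, no_delay=True, blocking=False):
    """Apply a few throughput-related options and report what the OS accepted.

    buffer_bytes is an arbitrary example value; the kernel may round or cap it.
    """
    # TCP_NODELAY = 1 disables the Nagle algorithm (no coalescing of small writes).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if no_delay else 0)
    # Request larger kernel send and receive buffers.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    # Blocking vs. non-blocking I/O.
    sock.setblocking(blocking)
    return {
        "nodelay": bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)),
        "sndbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        "rcvbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        "blocking": sock.getblocking(),
    }


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(tune_socket(s))
    s.close()
```

Reading the options back matters because the operating system limits set the ceiling for the buffer sizes, which is where the kernel tuning item in the list comes in.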
Test network
Cluster size: 8x8
CPU: 2x Intel Xeon (2.4 GHz), 512 KB cache
I/O system: PCI-X: 4 buses (max 6)
Memory: two-way interleaved DDR: 3.2 GB/s (512 MB)
NICs: 1 Intel 82540EM GE
      1 Broadcom NetXtreme BCM 5703x GE
      1 Intel Pro 2546EB GE (2 port)
OS: Linux RedHat 2.4.18-27.7 (SMP)
Switches: 1 BATM T6 Multi Layer Gigabit Switch (medium range)
          2 Dell PowerConnect 5224 (medium range)
[Figure: Throughput per Node (MB/s, 0 to 140) vs. Fragment Size (Byte, 100 to 100000, log scale). Curves: link BW (1 Gbps); 8x8 EVB [P4 e1000 Powerconnect 5224]; 32x32 EVB [P3 AceNIC FastIron8000].]
Conditions:
XDAQ + Event Builder
No Readout Unit inputs
No Builder Unit outputs
No Event Manager
PC: dual P4 Xeon
Linux 2.4.19
NIC: e-1000
Switch: Powerconnect 5224
Standard MTU (1500 Bytes)
Each BU builds 128 events
Fixed fragment sizes
Result:
For fragment size ~ 4 kB:
Throughput/node ~100 MB/s
i.e. 80% utilisation
Working point
Event Building on the cluster
Two Rail Event Builder measurements
Test case:
Bare Event Builder (2x2)
No RU inputs
No BU outputs
No Event Manager
Options:
Non blocking TCP
Jumbo frames (MTU 8000)
Two rail
One thread
RU working point (16 kB):
Throughput/node 240 MB/s
i.e. 95% bandwidth
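The utilisation figures quoted for the one-rail and two-rail measurements can be checked with a short calculation. The sketch assumes decimal units (1 Gbps = 125 MB/s per rail) and ignores protocol overhead; the function name `link_utilisation` is just an illustrative helper.

```python
def link_utilisation(throughput_mb_s, link_gbps=1.0, rails=1):
    """Fraction of the raw link bandwidth used, with 1 Gbps taken as 125 MB/s."""
    capacity_mb_s = rails * link_gbps * 1000 / 8
    return throughput_mb_s / capacity_mb_s


# One rail, 8x8 configuration: about 100 MB/s per node.
print(f"{link_utilisation(100):.0%}")
# Two rails, 2x2 configuration: 240 MB/s per node.
print(f"{link_utilisation(240, rails=2):.0%}")
```

With these assumptions the one-rail case comes out at 80% and the two-rail case at 96%, close to the figure quoted on the slide.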
Achieved 100 MB/s per node in 8x8 configuration (1 rail).
Improvements seen with the use of two rails, non-blocking I/O and Jumbo frames. In 2x2 configuration over 230 MB/s obtained.
High CPU load.
We are also studying other networking and traffic shaping options.
Conclusions
