
FPGA Co-processor for the ALICE High Level Trigger

Gaute Grastveit, University of Bergen, Norway

H. Helstrup1, J. Lien1, V. Lindenstruth2, C. Loizides5, D. Roehrich3, B. Skaali4, T. Steinbeck2, K. Ullaland3, A. Vestbo3, T. Vik4, A. Wiebalck2 for the ALICE Collaboration

1 Bergen College, Norway
2 Kirchhoff Institute for Physics, University of Heidelberg, Germany
3 Department of Physics, University of Bergen, Norway
4 Department of Physics, University of Oslo, Norway
5 Institute of Nuclear Physics, University of Frankfurt, Germany

ALICE - A Large Ion Collider Experiment

TPC - Time Projection Chamber

Very High Data Rate
• Pb-Pb central collisions
• Event rate: 200 Hz
• Event size: ~75 MB => 15 Gbyte/s
• Max data-rate to tape is 1.25 Gbyte/s
• Compression/selection is needed
• Conventional, lossless methods: factor 2
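The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted on the slide; the function names and the convention 1 Gbyte = 1000 MB are assumptions made for illustration.

```python
# Figures quoted on the slide.
EVENT_RATE_HZ = 200        # central Pb-Pb events per second
EVENT_SIZE_MB = 75         # approximate event size
TAPE_LIMIT_GB_S = 1.25     # maximum data rate to tape
LOSSLESS_FACTOR = 2        # gain of conventional lossless compression


def input_rate_gb_s(rate_hz, size_mb):
    """Raw detector data rate in Gbyte/s (assuming 1 Gbyte = 1000 MB)."""
    return rate_hz * size_mb / 1000.0


def required_reduction(rate_gb_s, limit_gb_s):
    """Factor by which the data must shrink to fit the tape bandwidth."""
    return rate_gb_s / limit_gb_s


rate = input_rate_gb_s(EVENT_RATE_HZ, EVENT_SIZE_MB)
needed = required_reduction(rate, TAPE_LIMIT_GB_S)
after_lossless = rate / LOSSLESS_FACTOR

print(f"input rate: {rate} Gbyte/s")
print(f"required reduction factor: {needed}")
print(f"rate after lossless compression: {after_lossless} Gbyte/s")
```

With these inputs the raw rate is 15 Gbyte/s, so a reduction by a factor of 12 is needed, while lossless compression alone still leaves 7.5 Gbyte/s. This is why the slide calls for compression and selection in the HLT.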

HLT functionality
• Compress
  • Reduce the amount of data required to encode the event as far as possible without losing physics information
• Trigger
  • Accept/reject events on the basis of physics application
• Select
  • Select regions of interest within an event
  • Remove pile-up in p-p

Task: reconstruct the tracks of 20,000 charged particles (each producing 150 clusters) in the TPC
Time budget: 5 ms

The HLT setup
• Data are received in parallel: 216 x 320 MB/s, 216 x 100 MB/s
• Diagram labels: TPC FEE (ALTRO, RCU), Buffer (8 events), DDL, RORC receiver, Buffer (> 1000 events), PCI, RcvBd, NIC, HLT farm
• RCU - Readout Controller Unit
• DDL - Detector Data Link
• RORC - ReadOut Receiver Card
• PCI kernel in the FPGA
• FPGA will also be utilised for pattern recognition
• Reduces the number of CPUs needed

The HLT FPGA co-processor
• FPGA: APEX 20K400
• Next prototype: Altera Stratix FPGA
  – Large internal memory
  – DSP cores

Two Schemes for Finding Tracks
• Low occupancy (p-p, Pb-Pb outer padrows)
  • Conventional approach with (2d) cluster finder and track follower
• High occupancy (overlapping clusters):
  • Hough transform on raw data
  • Cluster analysis for deconvolution
  • (Kalman filter)
• Figure: high multiplicity picture

Cluster Finder

• The numbers represent charge (ADC values); the axes are pad and time
• A vertical uninterrupted stack of numbers is called a sequence
• The square shows the geometric centre of the sequence
• Neighbouring sequences belong to the same cluster
• Final mean value (weighted mean):

$$\text{mean} = \frac{\sum \text{charge} \cdot \text{scalevalue}}{\sum \text{charge}}$$
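A minimal sketch of the weighted mean for a single sequence follows. It assumes the scale value of each sample is its time-bin number; the function name and the argument layout are illustrative and are not taken from the firmware.

```python
def sequence_mean(charges, first_time_bin):
    """Charge-weighted mean time of one sequence.

    charges        -- ADC values of consecutive time bins on one pad
    first_time_bin -- time bin of charges[0] (assumed scale value origin)
    """
    total = sum(charges)
    if total == 0:
        raise ValueError("a sequence must contain some charge")
    weighted = sum(q * (first_time_bin + i) for i, q in enumerate(charges))
    return weighted / total


# A symmetric sequence: the weighted mean coincides with the geometric centre.
symmetric = sequence_mean([2, 6, 2], first_time_bin=10)

# An asymmetric sequence: the mean is pulled towards the larger charge.
asymmetric = sequence_mean([1, 3], first_time_bin=5)

print(symmetric, asymmetric)
```

For the symmetric sequence the result equals the geometric centre (time bin 11), while the asymmetric one gives 5.75, closer to the sample carrying more charge.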

FPGA implementation of a cluster finder: the algorithm
• Calculate the mean for every sequence
• Adjacent pads with similar means are merged
• Two lists of sequences are used:
  • one for clusters on the previous pad
  • one for clusters on the current pad
• Clusters are removed from the searchrange when a match is found or we know it is finished
• Clusters are inserted in the inputrange after merging or when we start a new cluster
• Figure: memory of clusters, with "begin" and "end" marking the searchrange (previous pad) and "insert" marking the inputrange (current pad)
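The steps above can be sketched in software as follows. This is an illustrative model under stated assumptions, not the VHDL or the C++ reference model: the `Sequence` and `Cluster` types, the `match_distance` default of 2 time bins, and the rule that the first matching cluster wins are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Sequence:
    pad: int
    first_time: int
    charges: list


@dataclass(eq=False)
class Cluster:
    charge: float = 0.0
    time_sum: float = 0.0    # sum of charge * mean time
    pad_sum: float = 0.0     # sum of charge * pad
    last_mean: float = 0.0   # mean time of the last merged sequence

    def add(self, charge, mean_time, pad):
        self.charge += charge
        self.time_sum += charge * mean_time
        self.pad_sum += charge * pad
        self.last_mean = mean_time

    @property
    def time(self):
        return self.time_sum / self.charge

    @property
    def pad(self):
        return self.pad_sum / self.charge


def find_clusters(sequences, match_distance=2.0):
    finished = []
    previous, current = [], []   # searchrange and inputrange
    current_pad = None
    for seq in sorted(sequences, key=lambda s: (s.pad, s.first_time)):
        if seq.pad != current_pad:
            # Clusters left unmatched in the searchrange are finished.
            finished.extend(previous)
            previous, current = current, []
            if current_pad is not None and seq.pad != current_pad + 1:
                # A pad was skipped, so nothing can continue across the gap.
                finished.extend(previous)
                previous = []
            current_pad = seq.pad
        total = sum(seq.charges)
        mean = sum(q * (seq.first_time + i)
                   for i, q in enumerate(seq.charges)) / total
        match = next((c for c in previous
                      if abs(c.last_mean - mean) <= match_distance), None)
        if match is not None:
            previous.remove(match)   # remove from the searchrange
        else:
            match = Cluster()        # start a new cluster
        match.add(total, mean, seq.pad)
        current.append(match)        # insert into the inputrange
    return finished + previous + current


example = [
    Sequence(pad=3, first_time=10, charges=[2, 6, 2]),
    Sequence(pad=4, first_time=10, charges=[4, 8, 4]),
    Sequence(pad=4, first_time=40, charges=[5, 5]),
    Sequence(pad=5, first_time=11, charges=[3, 3]),
]
clusters = sorted(find_clusters(example), key=lambda c: c.charge)
for c in clusters:
    print(f"charge={c.charge} pad={c.pad} time={c.time}")
```

In the example the three sequences around time bin 11 on pads 3 to 5 merge into one cluster, and the isolated sequence at time bin 40 forms a second one.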

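Block Diagram, Verification
• Top structure: RAM (lpm), Decoder, FIFO (lpm), Merger
• Signals: seq (Decoder to FIFO, FIFO to Merger), cluster (Merger output)
• Testbench: File: charges as input, File: VHDL clusters as output
• C++ model: File: C++ clusters as output
• A C++ program compares the results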

Relative Scales
As before, the mean is calculated by:

$$\text{mean} = \frac{\sum \text{charge} \cdot \text{scalevalue}}{\sum \text{charge}}$$

• + Smaller numbers, only multiplies by <11
• Multiplication can't be done until merging takes place
• Alternative (absolute): Decoder, FIFO (lpm), Pre_Calc (2 mult, 1 add), Merger
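One reading of the relative scale is that each charge is multiplied by its small offset within the cluster rather than by its absolute coordinate, and the offset of the first sample is added back afterwards. The sketch below only checks that both forms give the same mean; the function names are hypothetical, and the slide does not confirm this exact formulation.

```python
def mean_absolute(charges, first_bin):
    """Weighted mean using absolute scale values (large multipliers)."""
    weighted = sum(q * (first_bin + i) for i, q in enumerate(charges))
    return weighted / sum(charges)


def mean_relative(charges, first_bin):
    """Weighted mean using offsets from the first sample (small multipliers).

    Only the offsets 0, 1, 2, ... are multiplied with the charges; the
    position of the first sample is added once at the end.
    """
    weighted = sum(q * i for i, q in enumerate(charges))
    return first_bin + weighted / sum(charges)


charges = [3, 7, 5, 1]
absolute = mean_absolute(charges, first_bin=200)
relative = mean_relative(charges, first_bin=200)
print(absolute, relative)
```

Both calls return the same value, because the sum of charge times (first_bin + i) equals first_bin times the total charge plus the sum of charge times i. The relative form keeps the multiplier operands small, which is the advantage the slide lists.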

Deconvolution
• Simplified implementation, almost for free
• Splits at minima in both directions (time and pad)
• Figure: cluster finding with deconvolution off and on
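Splitting at a minimum can be illustrated for the time direction with a short sketch. The function name, the strict definition of a local minimum, and the choice to assign the minimum sample to the right-hand part are assumptions; the slide does not specify them.

```python
def split_at_minima(charges):
    """Split one sequence at its interior local minima.

    Returns a list of (offset, part) pairs, where offset is the index of
    the first sample of the part within the original sequence. A sample
    is a split point when it is strictly smaller than both neighbours.
    """
    parts = []
    start = 0
    for i in range(1, len(charges) - 1):
        if charges[i - 1] > charges[i] < charges[i + 1]:
            parts.append((start, charges[start:i]))
            start = i   # assumption: the minimum opens the next part
    parts.append((start, charges[start:]))
    return parts


overlapping = split_at_minima([2, 8, 3, 9, 4])
single_peak = split_at_minima([1, 5, 9, 5, 1])
print(overlapping)
print(single_peak)
```

A sequence with two peaks is split into two parts at the dip between them, while a single-peak sequence is returned unchanged. The same test applied to the sequence means or charges of neighbouring pads would give the split in the pad direction.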

Merger
Goals
• spend few clock cycles per sequence
• use few logic elements
• high clockspeed

Figure: state diagram of the Merger. States: idle, calc_dist, merge_mult, merge_add, merge_store, insert_seq, send_one, send_many, send_all. Transition labels: new data, next pad, new row or skip pad, new search range, empty, old is above, old is below, within match distance.

Figure: clock cycles spent in the different states. Shares shown: 30 %, 22 %, 11 %, 11 %, 11 %, 6 %, 5 %, 4 %, 0 %.

Cluster Finder Performance
• Synthesized on Altera APEX
• Uses 1800 Logic Elements (11%)
• Circuit runs at 33 MHz
• Memory usage 16*80 + 64*112 = 8448 bits (4%)

Outlook
Implementation of Hough transformation
• Processing chain: Detector Data Link, Data Format Decoder, XYZ Transformer, ABE Transformer, Histogram 1 ... Histogram N, Find Maxima
• ADC count: 10-to-8 Bit Converter
• Data representation along the chain: Back Linked List (ALTRO sequences), TPC coordinates (Padrow, Pad, Time), Local coordinates (X, Y, Z), (A, B, E), Parameter Space (k, phi, eta-index)
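The histogramming and maximum-finding stages can be illustrated with a software sketch for a single eta slice. It assumes tracks are circles through the origin in the local (X, Y) plane, so that a point at radius r and azimuth angle satisfies k = 2 sin(angle - phi) / r. That relation, the bin ranges, and all names below are assumptions made for illustration; the slide gives neither the formula nor the content of the intermediate (A, B, E) step, which is skipped here.

```python
import math


def hough_transform(points, phi_range=(-0.5, 0.5), n_phi=100,
                    kappa_max=0.01, n_kappa=100):
    """Fill a (phi, k) histogram from (x, y) points of one eta slice."""
    phi_min, phi_max = phi_range
    d_phi = (phi_max - phi_min) / n_phi
    d_kappa = 2.0 * kappa_max / n_kappa
    hist = [[0] * n_kappa for _ in range(n_phi)]
    for x, y in points:
        r = math.hypot(x, y)
        if r == 0.0:
            continue
        angle = math.atan2(y, x)
        for j in range(n_phi):
            phi0 = phi_min + (j + 0.5) * d_phi          # bin centre
            kappa = 2.0 * math.sin(angle - phi0) / r    # assumed track model
            k = math.floor((kappa + kappa_max) / d_kappa)
            if 0 <= k < n_kappa:
                hist[j][k] += 1
    return hist, d_phi, d_kappa


def find_maximum(hist, phi_min, d_phi, kappa_max, d_kappa):
    """Return (k, phi, votes) at the centre of the fullest bin."""
    votes, j, k = max((v, j, k)
                      for j, row in enumerate(hist)
                      for k, v in enumerate(row))
    return (-kappa_max + (k + 0.5) * d_kappa,
            phi_min + (j + 0.5) * d_phi,
            votes)


# Nine points generated on one circle with known parameters.
true_kappa, true_phi0 = 0.0051, 0.105
track = []
for r in range(80, 241, 20):
    angle = true_phi0 + math.asin(true_kappa * r / 2.0)
    track.append((r * math.cos(angle), r * math.sin(angle)))

hist, d_phi, d_kappa = hough_transform(track)
kappa, phi0, votes = find_maximum(hist, -0.5, d_phi, 0.01, d_kappa)
print(kappa, phi0, votes)
```

Each point votes for one k bin per phi bin, and only at the true parameters do the votes of all points fall into the same bin. In a hardware version the inner loop over phi bins is the natural candidate for parallel histogram units, which matches the Histogram 1 ... Histogram N blocks in the diagram.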

Conclusion
• We have demonstrated the feasibility of a real-time cluster finder implemented in an FPGA
• Firmware implementation of a Hough transform looks promising

Transparency replacements from now on

ALICE - A Large Ion Collider Experiment

TPC - Time Projection Chamber
• 18 sectors on each side; each sector is read out in 6 subsectors
• Total is ca. 570,000 pads
