

# CMS Level-1 Track Trigger Overview & KIT contributions

Oliver Sander, Marc Weber, Christian Amstutz, Luis Ardila, Matthias Balzer, Timo Dritschler, Tanja Harbaum, Armin Herth, Hannes Mohr, Benjamin Oldenburg, Thomas Schuh, Denis Tcherniakhovski

Institut für Prozessdatenverarbeitung und Elektronik





www.kit.edu





### LHC / HL-LHC Plan

[Figure: LHC / HL-LHC timeline, 2013 to 2037, showing Run 1, LS1, Run 2, EYETS, LS2, Run 3, LS3 and HL-LHC operation. Collision energy rises from 7 TeV and 8 TeV through 13 TeV to 14 TeV; luminosity goes from 75% nominal through nominal and 2 x nominal to 5 to 7 x nominal. Integrated luminosity milestones: 30 fb<sup>-1</sup>, 150 fb<sup>-1</sup>, 300 fb<sup>-1</sup>, 3000 fb<sup>-1</sup>.]

- High Luminosity operation starts in 2026
- 5-7 times higher luminosity
- 40 MHz collision rate
- Up to 200 simultaneous pp-collisions in one single event

## **LHC Event Display**





### Higgs Candidate 2012

## **Possible High Luminosity LHC Event Display**





### Recorded Event: "high pile-up fill", 2016, average pile-up of only 100

4 12.04.17 Oliver Sander - CMS Level-1 Track Trigger Overview & KIT activities

## **Current CMS Trigger System**





CMS uses a two-level trigger system

### Level 1:

- Customized programmable electronics
- Full 40 MHz event rate
- Buffer for raw data limits trigger time to 3.2 μs
- Only calorimeter and muon data

### High Level Trigger:

- Runs on commercial computer farm
- Track reconstruction solely on demand and in selected regions of interest

## **Upgraded CMS Trigger System**



Simple Evaluation



### NEW: Outer tracker data is used in L1 Trigger

- ~100 Tb/s incoming data rate
- Data reduction by a factor of 50000
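
For a rough sense of scale, the two figures above can be combined with the 40 MHz collision rate quoted earlier. This is only back-of-the-envelope arithmetic; the constant and function names are illustrative and not part of any CMS software.

```python
# Back-of-the-envelope arithmetic using only numbers quoted on the slides.
INCOMING_RATE_BPS = 100e12   # ~100 Tb/s into the L1 track trigger
REDUCTION_FACTOR = 50_000    # required data reduction
COLLISION_RATE_HZ = 40e6     # 40 MHz collision rate


def rate_after_reduction(incoming_bps=INCOMING_RATE_BPS, factor=REDUCTION_FACTOR):
    """Data rate (bit/s) that remains after the quoted reduction factor."""
    return incoming_bps / factor


def bits_per_bunch_crossing(incoming_bps=INCOMING_RATE_BPS, rate_hz=COLLISION_RATE_HZ):
    """Average number of bits arriving per bunch crossing."""
    return incoming_bps / rate_hz


if __name__ == "__main__":
    print(f"after reduction: {rate_after_reduction() / 1e9:.1f} Gb/s")
    print(f"per crossing:    {bits_per_bunch_crossing() / 1e6:.1f} Mb")
```

With these inputs about 2 Gb/s remain after the reduction, and each bunch crossing delivers on average about 2.5 Mb of tracker data.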

## New outer tracker for phase II



- Complete replacement of outer tracker
- 13 296 modules, 192 m<sup>2</sup>, 42M strips, 170M macro-pixels (before 10 M)
- 6 layers in the barrel and 3x5 end caps



- Two types of double layer sensor modules:
  - Strip-strip (2S): 5 cm × 90 μm
  - Pixel-strip (PS): 1.6 mm / 2.5 cm × 100 μm
- Designed for reconstruction of tracks with pT > 2-3 GeV



## **First Level of Data Reduction in the Detector**



- The two-layer architecture allows pT to be estimated
- Cut at 2-3 GeV allows for data reduction by one order of magnitude
- Output data rate
  - ~ 20 k Stubs @ 40 MHz  $\rightarrow$  50 Tb/s
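
The pT estimate in a double-layer module can be sketched from textbook geometry: a track from the beam line with radius of curvature R crosses a barrel module at radius r under an angle α with sin α = r / (2R), so the hits in the two sensors are displaced by roughly gap · r / (2R). The sketch below assumes the 3.8 T CMS solenoid field and the standard relation pT [GeV] = 0.3 · B [T] · R [m]; the sensor gap and the function names are hypothetical and chosen only for illustration.

```python
B_FIELD_T = 3.8  # CMS solenoid field in tesla


def expected_bend_mm(pt_gev, r_m, sensor_gap_mm, b_field_t=B_FIELD_T):
    """Approximate hit displacement between the two sensors of a barrel module.

    Uses R = pT / (0.3 B) and the small-angle form tan(alpha) ~ sin(alpha) = r / (2R).
    """
    radius_m = pt_gev / (0.3 * b_field_t)
    return sensor_gap_mm * r_m / (2.0 * radius_m)


def passes_pt_cut(measured_bend_mm, r_m, sensor_gap_mm, pt_min_gev=2.0):
    """Accept a hit pair as a stub if its bend is compatible with pT >= pt_min_gev."""
    return abs(measured_bend_mm) <= expected_bend_mm(pt_min_gev, r_m, sensor_gap_mm)


if __name__ == "__main__":
    # Hypothetical module: radius 1.0 m, sensor gap 1.8 mm.
    for pt in (1.0, 2.0, 5.0, 10.0):
        bend = expected_bend_mm(pt, 1.0, 1.8)
        print(f"pT = {pt:4.1f} GeV  bend = {bend:.3f} mm  accepted = {passes_pt_cut(bend, 1.0, 1.8)}")
```

For the hypothetical module above, a 2 GeV track bends by about 0.51 mm between the sensors; softer tracks bend more and fail the cut, which is where the on-detector data reduction comes from.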

## **CMS Track Trigger Approaches**




|                | Associative Memory<br>(AM) Approach | Time-Multiplexing Track<br>Trigger (TMTT) Approach | Tracklet Approach              |
|----------------|-------------------------------------|----------------------------------------------------|--------------------------------|
| Track Finding  | Associative Memory                  | Hough-Transform                                    | Tracklet                       |
| Track Fitting  | PCA based linearized fit            | Kalman Filter                                      | Linearized χ<sup>2</sup> fit   |
| Sectors        | 48                                  | 8                                                  | 18                             |
| Time Multiplex | 20x                                 | 36x                                                | 6x                             |
| Hardware       | FPGA+ASIC                           | FPGA                                               | FPGA                           |

**Involved Groups** 









All approaches were compared in a review in December 2016

## **Time-Multiplexing Track Trigger Approach**










## **Hough Transformation to find Track Candidates**

- search for primary tracks in the  $r-\phi$  plane
- infinitely many circles ( $\phi_0$ , R) can pass through the origin and a single measured stub position (r,  $\phi$ )



but track parameters are correlated

$$\phi_{58} \approx \phi + \frac{q}{p_{\rm T}} \times r_{58}$$

( $\phi_{58}$  and  $r_{58}$  are slightly transformed variables)

### stub positions correspond to straight lines in the track parameter plane



## Hough Transformation to find Track Candidates



1. for each stub, calculate  $\phi_{58}$  for each  $q/p_{\rm T}$  bin
2. fill the stub into the corresponding cells of a  $32 \times 32$  track parameter array
   - ignore  $q/p_{\rm T}$  values which are inconsistent with the  $p_{\rm T}$  estimate of the stub
3. define cells with stubs from at least 5 different layers as track candidates
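
The three steps can be sketched in software as follows. This is a minimal illustration, not the firmware: it uses the simplified relation φ58 ≈ φ + (q/pT) · r58 from the previous slide with all constants absorbed into arbitrary units, the array boundaries `m_range` and `phi_range` are hypothetical, and the per-stub pT consistency cut of step 2 is left out for brevity.

```python
import math

N_BINS = 32       # 32 x 32 track parameter array, as on the slide
MIN_LAYERS = 5    # minimum number of distinct layers, as on the slide


def find_track_candidates(stubs, m_range=(-1.0, 1.0), phi_range=(-0.5, 0.5)):
    """Return the (q/pT bin, phi58 bin) cells that qualify as track candidates.

    stubs: iterable of (r58, phi, layer) tuples.
    m_range, phi_range: hypothetical array boundaries in arbitrary units.
    """
    m_width = (m_range[1] - m_range[0]) / N_BINS
    phi_width = (phi_range[1] - phi_range[0]) / N_BINS
    # One set of layer ids per cell, so repeated hits in a layer count once.
    layers_in_cell = [[set() for _ in range(N_BINS)] for _ in range(N_BINS)]

    for r58, phi, layer in stubs:
        for i in range(N_BINS):
            m = m_range[0] + (i + 0.5) * m_width           # q/pT at the bin centre
            phi58 = phi + m * r58                          # step 1
            j = math.floor((phi58 - phi_range[0]) / phi_width)
            if 0 <= j < N_BINS:
                layers_in_cell[i][j].add(layer)            # step 2

    return [(i, j)                                          # step 3
            for i in range(N_BINS) for j in range(N_BINS)
            if len(layers_in_cell[i][j]) >= MIN_LAYERS]


if __name__ == "__main__":
    # Six stubs generated from one track sitting at the centre of cell (20, 10).
    m_true, phi58_true = 0.28125, -0.171875
    radii = [-0.35, -0.2, -0.05, 0.1, 0.3, 0.5]
    stubs = [(r, phi58_true - m_true * r, layer) for layer, r in enumerate(radii, start=1)]
    print(find_track_candidates(stubs))
```

Each stub draws a straight line through the array; only in the cell that matches the true track parameters do the lines from enough layers cross.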

## **Architecture for Hough Transformation**



- the array is implemented as a pipeline that processes one stub per clock cycle (240 MHz)
- first step is the filling of the array
- second step is the readout of track candidates



- Book Keeper unpacks stub data from 2 input links, which then propagate to each of the 32 q/pT bins in turn
- track candidates found by the bins propagate back to the Book Keeper, which transmits them over two links
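
A toy model can illustrate what "one stub per clock cycle" and "each bin in turn" imply for timing. The sketch below assumes, purely for illustration, that the 32 q/pT bins form a simple chain in which a stub advances by one bin per clock cycle; the real firmware structure is more involved, and the function name is hypothetical.

```python
def simulate_bin_chain(stubs, n_bins=32):
    """Shift stubs through a chain of bins, one stage per clock cycle.

    Returns (cycles, seen), where cycles is the number of clock cycles until
    the last stub has reached the last bin and seen[i] lists the stubs
    processed by bin i in arrival order.
    """
    stages = [None] * n_bins
    seen = [[] for _ in range(n_bins)]
    queue = list(stubs)
    cycles = 0
    # Run while stubs are waiting or still travelling towards the last bin.
    while queue or any(s is not None for s in stages[:-1]):
        incoming = queue.pop(0) if queue else None
        stages = [incoming] + stages[:-1]       # every stub moves one bin on
        for i, stub in enumerate(stages):
            if stub is not None:
                seen[i].append(stub)
        cycles += 1
    return cycles, seen


if __name__ == "__main__":
    cycles, seen = simulate_bin_chain(range(100))
    print(cycles, "cycles,", cycles / 240e6 * 1e6, "us at 240 MHz")
```

In this model n stubs need n + 31 cycles to pass all 32 bins, so 100 stubs take 131 cycles, or about 0.55 μs at 240 MHz. The model covers only the filling step and is not meant to reproduce the measured latency of the implementation.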





## **Implementation Results of HT**



- High tracking efficiency
- Very good agreement between SW simulation and HW implementation
- FIFO Latency of the HT is constant at 1.025 μs
- Data reduction by one order of magnitude

## Kalman Filter and Duplicate removal



### Kalman Filter



- KF fits track parameters and removes incorrect stubs
- CMS already uses KF in offline reconstruction and in HLT
- In L1 Trigger only possible through massive data reduction and candidate building in the Hough transformation step
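
How a Kalman filter both fits parameters and rejects incorrect stubs can be shown with a deliberately reduced model. The sketch below assumes a two-parameter state (φ58, q/pT) in the r-φ plane with the linear measurement model φ = φ58 − (q/pT) · r58, seeded from the HT cell; the resolution `sigma_phi`, the `chi2_cut` and all names are hypothetical, and the actual L1 fit is not specified on the slide.

```python
import numpy as np


def kalman_fit(stubs, x0, P0, sigma_phi=1e-3, chi2_cut=9.0):
    """Fit (phi58, q/pT) to stubs given as (r58, phi) and drop outlier stubs.

    x0, P0: seed state and covariance, e.g. centre and size of the HT cell.
    Returns the fitted state, its covariance and the list of accepted stubs.
    """
    x = np.array(x0, dtype=float)
    P = np.array(P0, dtype=float)
    kept = []
    for r58, phi in stubs:
        H = np.array([1.0, -r58])            # measurement model: phi = H @ x
        residual = phi - H @ x
        S = H @ P @ H + sigma_phi ** 2       # variance of the predicted residual
        if residual ** 2 / S > chi2_cut:
            continue                         # stub is incompatible with the track
        K = P @ H / S                        # Kalman gain
        x = x + K * residual
        P = P - np.outer(K, H @ P)
        kept.append((r58, phi))
    return x, P, kept


if __name__ == "__main__":
    true_phi58, true_m = 0.1, 0.2
    good = [(r, true_phi58 - true_m * r) for r in (-0.3, -0.15, 0.0, 0.15, 0.3)]
    wrong = (0.45, true_phi58 - true_m * 0.45 + 0.05)   # stub from another particle
    x, P, kept = kalman_fit(good[:2] + [wrong] + good[2:],
                            x0=(0.11, 0.22), P0=np.diag([0.03 ** 2, 0.06 ** 2]))
    print(x, len(kept))
```

Because the state is updated stub by stub, an incompatible stub can be dropped as soon as it is seen, without refitting the whole candidate.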

### Duplicate Removal



- DR removes candidates whose KF parameters do not match the HT parameters
- Simple compared to conventional algorithms (comparison of track pairs)
- Simplicity achieved through a deep understanding of how HT tracks are formed
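
One way to read the first bullet is sketched below: a fitted track is kept only if its KF parameters fall back into the HT cell that produced the candidate, so no pairwise comparison of tracks is needed. The cell boundaries, the dictionary keys and the function names are assumptions made for this illustration.

```python
import math


def cell_of(m, phi58, m_range=(-1.0, 1.0), phi_range=(-0.5, 0.5), n_bins=32):
    """Map fitted (q/pT, phi58) to the index of the HT array cell containing them."""
    i = math.floor((m - m_range[0]) / (m_range[1] - m_range[0]) * n_bins)
    j = math.floor((phi58 - phi_range[0]) / (phi_range[1] - phi_range[0]) * n_bins)
    return i, j


def remove_duplicates(tracks):
    """Keep a track only if the KF result lies in the HT cell it was found in.

    tracks: list of dicts with keys 'ht_cell', 'm' (fitted q/pT) and 'phi58'.
    """
    return [t for t in tracks if cell_of(t["m"], t["phi58"]) == t["ht_cell"]]


if __name__ == "__main__":
    tracks = [
        {"ht_cell": (20, 10), "m": 0.28, "phi58": -0.17},   # fit agrees with its cell
        {"ht_cell": (21, 11), "m": 0.28, "phi58": -0.17},   # same particle, neighbouring cell
    ]
    print(remove_duplicates(tracks))
```

Each candidate is judged on its own, which keeps the cost linear in the number of candidates instead of quadratic as for a comparison of track pairs.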

## **Proposed System for AM-based Track Finding**






## **Overview Processing in AM Approach**



**AM: Pattern Recognition Mezzanine** = TMTT: Track Finding Processor



## Associative memory template matching

- We know what interesting tracks with pT > 2 GeV/c look like.
- Store corresponding patterns in AM.
- Compare patterns with data hits or stubs in "one" clock cycle
- Hits associated with a pattern are input to track fitting algorithms
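
The matching principle can be emulated in a few lines of software. In the sketch below a pattern is a tuple with one coarse hit address per layer, and a pattern fires when enough of its layers contain a matching hit. The loop visits patterns one after another, whereas the AM chip compares all stored patterns at once; the data layout and the names are assumptions made for this illustration.

```python
def match_patterns(pattern_bank, hits_by_layer, min_layers=None):
    """Return the indices of all patterns matched by the hits of one event.

    pattern_bank: list of tuples, one coarse address per detector layer.
    hits_by_layer: list of sets with the coarse addresses hit in each layer.
    min_layers: number of layers that must match (default: all layers).
    """
    fired = []
    for pattern_id, pattern in enumerate(pattern_bank):
        required = len(pattern) if min_layers is None else min_layers
        n_matched = sum(1 for layer, address in enumerate(pattern)
                        if address in hits_by_layer[layer])
        if n_matched >= required:
            fired.append(pattern_id)
    return fired


if __name__ == "__main__":
    bank = [(1, 2, 3), (1, 2, 4), (5, 5, 5)]
    hits = [{1}, {2, 5}, {4}]
    print(match_patterns(bank, hits))                # all layers required
    print(match_patterns(bank, hits, min_layers=2))  # one missing layer tolerated
```

The hits belonging to the fired patterns are what would then be handed to the track fit.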





## The AM Chip



| Vers.  | Design                                    | Tech.  | Area                       | Patterns | Package     |
|--------|-------------------------------------------|--------|----------------------------|----------|-------------|
| 1      | Full custom                               | 700 nm |                            | 128      | QFP         |
| 2      | FPGA                                      | 350 nm |                            | 128      | QFP         |
| 3      | Std cells                                 | 180 nm | 100 mm<sup>2</sup>         | 5 k      | QFP         |
| 4      | Std cells +<br>Full custom                | 65 nm  | 14 mm<sup>2</sup>          | 8 k      | QFP         |
| mini-5 | Std cells +<br>Full custom<br>+ IP blocks | 65 nm  | 4 mm<sup>2</sup>           | 0.5 k    | QFP         |
| 5      | Std cells +<br>Full custom<br>+ IP blocks | 65 nm  | 12 mm<sup>2</sup>          | 3 k      | BGA         |
| 6      | Std cells +<br>Full custom<br>+ IP blocks | 65 nm  | **168 mm<sup>2</sup>**     | 128 k    | BGA         |
| 7      | Std cells +<br>Full custom                | 28 nm  | 10 mm<sup>2</sup>          | 16 k     | BGA,<br>SiP |







## **KIT Contributions to PRM Development**



- Manufacturing and test of PRM05
- Design contributions to PRM06
- Commissioning of PRM06
- Development of firmware components
  - Memory interfaces
  - Configuration infrastructure
- Power and Temperature measurements

### PRM05



- 6 PRMs (3@16AM05 + 3@4AM05) available and tested
- FPGA: Kintex7
- 16 AM05: total of 32 kpatterns
- GTX maximum speed: 8 Gbps
- Single LDRAM

### PRM06

- 3 PRMs available and tested
- Ready to produce additional PCBs
- FPGA: Kintex Ultrascale
- 12 AM06: total of 1.5 Mpatterns
- GTH maximum speed: 12.5 Gbps
- Double RLD3RAM 1 Meg x 36 x 16 Banks, 1066 MHz DDR operation
- Flash memory





## **Case Study 1: System Simulation of the CMS L1 Track Trigger**





- Evaluation of different system architectures
- Evaluation of system parameters, e.g. latencies, bandwidths
- Usage of input data from physics simulations

## **Case Study 2: Hough Transformation on GPUs**



- Why not use GPUs for this number crunching task?
- Minimum latency approach
  - Parallel computation for each p<sub>T</sub> bin
  - Computational redundancy, but minimal latency
  - Load balancing challenging
  - Synchronization between p<sub>T</sub> bins challenging
- Runtimes and transfer times very stable
- Next steps:
  - Include fitting
  - Maximize throughput
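
The per-bin parallelism can be illustrated without a GPU. In the NumPy sketch below every q/pT bin is one row of a single array operation, which is the part a GPU would distribute over threads. This is not the GPU code of the case study: the array boundaries, the bit-mask bookkeeping of layers and the function name are assumptions, and the simplified relation φ58 ≈ φ + (q/pT) · r58 is used in arbitrary units.

```python
import numpy as np


def hough_all_bins(r58, phi, layer, n_bins=32, m_range=(-1.0, 1.0),
                   phi_range=(-0.5, 0.5), min_layers=5, n_layer_ids=8):
    """Data-parallel Hough transform; returns an array of (q/pT bin, phi58 bin) rows."""
    r58 = np.asarray(r58, dtype=float)
    phi = np.asarray(phi, dtype=float)
    layer = np.asarray(layer, dtype=np.int64)

    m_width = (m_range[1] - m_range[0]) / n_bins
    phi_width = (phi_range[1] - phi_range[0]) / n_bins
    m = m_range[0] + (np.arange(n_bins) + 0.5) * m_width       # bin centres

    # Shape (n_bins, n_stubs): each row is independent of all other rows.
    phi58 = phi[None, :] + m[:, None] * r58[None, :]
    j = np.floor((phi58 - phi_range[0]) / phi_width).astype(np.int64)
    i = np.broadcast_to(np.arange(n_bins)[:, None], j.shape)
    bits = np.broadcast_to(np.left_shift(np.int64(1), layer)[None, :], j.shape)
    inside = (j >= 0) & (j < n_bins)

    # OR the layer bit of every stub into its cell, then count the set bits.
    cells = np.zeros((n_bins, n_bins), dtype=np.int64)
    np.bitwise_or.at(cells, (i[inside], j[inside]), bits[inside])
    n_layers = sum((cells >> b) & 1 for b in range(n_layer_ids))
    return np.argwhere(n_layers >= min_layers)


if __name__ == "__main__":
    m_true, phi58_true = 0.28125, -0.171875
    radii = np.array([-0.35, -0.2, -0.05, 0.1, 0.3, 0.5])
    print(hough_all_bins(radii, phi58_true - m_true * radii, np.arange(1, 7)))
```

Every stub is evaluated for every bin, which mirrors the trade-off named above: redundant computation in exchange for minimal latency.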

### DMA MEASUREMENT SETUP








## **Case Study 3: Implementing AM in FPGA logic**





### Layer 0

- ~300 miscellaneous values (sensors)
- >20000 matches per input possible

### Layer 5

- >4200 miscellaneous values
- max. 1230 matches per input

Minimization by logic optimization techniques






## Summary

- Building a L1 Track Trigger is a highly challenging task
  - Low Latency: < 4 μs for complete track reconstruction
  - High data rates: ~ 50 Tb/s
- Three different approaches are explored
  - TMTT: Hough Transformation and Kalman Filter
  - AM: ASIC and χ<sup>2</sup> fit
  - Tracklet: Tracklet based and χ<sup>2</sup> fit
- All three approaches are able to fulfill the tight timing and performance requirements (Dec/16 review)
- Some decisions remain to be made, e.g. which approach becomes the baseline
- Next steps
  - Harmonize hardware and interfaces among the approaches
  - Three teams shall converge to one single team



## **Karlsruhe Institute of Technology**



### Thank you for your attention.



Dr.-Ing. Oliver Sander Institut für Prozessdatenverarbeitung und Elektronik sander@kit.edu

