Graph Neural Network-based Track finding as a Service with ACTS
ORAL
Abstract
Recent progress in track finding for the High-Luminosity Large Hadron Collider (HL-LHC) has demonstrated the effectiveness of Graph Neural Network (GNN)-based algorithms. While these algorithms offer high efficiency and reasonable resolution, their computational cost on CPUs hinders real-time processing, so they require accelerators such as GPUs. However, the large size of the graphs involved poses a challenge for facilities lacking high-end GPUs.
To address this, we propose deploying the GNN-based track-finding algorithm as a service in the cloud or at high-performance computing centers such as the NERSC Perlmutter system, which has over 7000 NVIDIA A100 GPUs. We have implemented a tracking-as-a-service prototype within A Common Tracking Software (ACTS), a toolkit for charged particle track reconstruction.
This approach is algorithm-agnostic: other algorithms can be added as new backends that interact with the same client interface in ACTS. In this contribution, we showcase the versatility of the as-a-service approach by implementing the GNN-based track-finding workflow using the NVIDIA Triton Inference Server within ACTS. We assess track-finding throughput and GPU utilization, and explore the scalability of the inference server across the NERSC Perlmutter supercomputer and cloud resources.
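To illustrate the algorithm-agnostic idea, the following is a minimal Python sketch of a client that selects a track-finding backend by name. It is not the ACTS or Triton API: the names `TrackFindingBackend`, `ToyZSignBackend`, `TrackingClient`, and `infer` are hypothetical, the toy backend merely stands in for a real model such as a GNN pipeline, and the in-process dictionary lookup stands in for a network request to an inference server.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Tuple

Hit = Tuple[float, float, float]  # (x, y, z) of one spacepoint
Track = List[int]                 # indices of the hits forming one candidate


class TrackFindingBackend(ABC):
    """Hypothetical server-side interface: any algorithm mapping hits to tracks."""

    @abstractmethod
    def find_tracks(self, hits: Sequence[Hit]) -> List[Track]:
        ...


class ToyZSignBackend(TrackFindingBackend):
    """Toy stand-in for a real model: groups hits by the sign of z."""

    def find_tracks(self, hits: Sequence[Hit]) -> List[Track]:
        forward = [i for i, h in enumerate(hits) if h[2] >= 0.0]
        backward = [i for i, h in enumerate(hits) if h[2] < 0.0]
        # Drop empty groups so every returned track has at least one hit.
        return [t for t in (forward, backward) if t]


class TrackingClient:
    """Hypothetical client: picks a backend by name, as a remote request would."""

    def __init__(self, backends: Dict[str, TrackFindingBackend]):
        self._backends = backends

    def infer(self, model_name: str, hits: Sequence[Hit]) -> List[Track]:
        if model_name not in self._backends:
            raise KeyError(f"unknown backend: {model_name}")
        return self._backends[model_name].find_tracks(hits)


client = TrackingClient({"toy": ToyZSignBackend()})
hits = [(1.0, 0.0, 5.0), (2.0, 0.0, -3.0), (3.0, 0.0, 7.0)]
tracks = client.infer("toy", hits)
```

In this sketch the client code does not change when a new backend is registered under a different name, which is the property the as-a-service design relies on.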
*This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231 using NERSC award ERCAP0021226, and is supported by NSF award No. 2117997.
Presenters
Haoran Zhao (University of Washington)