Generalized SIMD algorithm for efficient EM-PIC simulations on modern CPUs
POSTER
Abstract
There are several relevant plasma physics scenarios where highly nonlinear and kinetic processes dominate. Further understanding of these scenarios is generally explored through relativistic particle-in-cell codes such as OSIRIS [1], but this algorithm is computationally intensive, and efficient use high end parallel HPC systems, exploring all levels of parallelism available, is required. In particular, most modern CPUs include a single-instruction-multiple-data (SIMD) vector unit that can significantly speed up the calculations. In this work we present a generalized PIC-SIMD algorithm that is shown to work efficiently with different CPU (AMD, Intel, IBM) and vector unit types (2-8 way, single/double). Details on the algorithm will be given, including the vectorization strategy and memory access. We will also present performance results for the various hardware variants analyzed, focusing on floating point efficiency. Finally, we will discuss the applicability of this type of algorithm for EM-PIC simulations on GPGPU architectures [2]. \\[4pt] [1] R. A. Fonseca et al., LNCS 2331, 342, (2002)\\[0pt] [2] V. K. Decyk, T. V. Singh; Comput. Phys. Commun. 182, 641-648 (2011)