Accelerating quantum Monte Carlo simulations on many-core processors via OpenMP nested threading

ORAL

Abstract


The massively parallel nature of quantum Monte Carlo is ideally suited to petascale computers, which have enabled a wide range of applications to relatively large molecular and extended systems. The current scheme for achieving the shortest time-to-solution places one walker per core on multi-core and many-core processors. However, this strategy faces a serious challenge on upcoming exascale computers, where much larger problems will be solved. The time to advance a walker scales cubically, and its memory requirement quadratically, with the number of electrons in the system. On current computers, large problems already force us to leave cores idle so that the walkers fit in memory. In addition, adding more walkers reduces the number of production steps needed for a given number of samples but does not reduce the number of equilibration steps, which wastes even more resources. To reduce both the time-to-solution and the memory footprint, we introduce nested threading to distribute the computation of each walker over several cores. We explore threading algorithms with minimal overhead and demonstrate good scaling.
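As a rough illustration of the nested-threading idea, the sketch below uses an outer OpenMP loop over walkers and an inner OpenMP loop that splits one walker's work across a small team of threads. It is not the algorithm presented in the talk: the `Walker` struct, the `advance_walker` and `advance_all` functions, and the sum-of-squares stand-in for the real per-walker computation are all hypothetical, chosen only to keep the example self-contained.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical walker: one scalar "coordinate" per electron (illustration only).
struct Walker {
    std::vector<double> electrons;
};

// Toy stand-in for advancing one walker: a sum of squares over its electrons.
// The inner team of threads_per_walker threads shares this single walker's work.
double advance_walker(const Walker& w, int threads_per_walker) {
    assert(threads_per_walker > 0);
    double sum = 0.0;
    const long n = static_cast<long>(w.electrons.size());
    #pragma omp parallel for num_threads(threads_per_walker) reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += w.electrons[i] * w.electrons[i];
    }
    return sum;
}

// Outer level: walker_teams threads each pick up walkers; each walker then
// opens its own inner parallel region (nested threading).
std::vector<double> advance_all(const std::vector<Walker>& walkers,
                                int walker_teams, int threads_per_walker) {
    assert(walker_teams > 0);
#ifdef _OPENMP
    // Allow two active levels of parallelism; otherwise the inner region
    // may run with a single thread.
    omp_set_max_active_levels(2);
#endif
    std::vector<double> result(walkers.size(), 0.0);
    const long n = static_cast<long>(walkers.size());
    #pragma omp parallel for num_threads(walker_teams)
    for (long iw = 0; iw < n; ++iw) {
        result[iw] = advance_walker(walkers[iw], threads_per_walker);
    }
    return result;
}
```

With fewer walkers than cores, for example `advance_all(walkers, 4, 4)` on a 16-core node, only four walkers need to reside in memory while all cores stay busy. Compiled without OpenMP support, the pragmas are ignored and the code runs serially with the same results.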

Presenters

  • Ye Luo

    Argonne National Laboratory

Authors

  • Ye Luo

    Argonne National Laboratory