High-efficiency wavefunction updates for large scale Quantum Monte Carlo

ORAL

Abstract

Within ab initio Quantum Monte Carlo (QMC) simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunctions. Evaluating each Monte Carlo move requires the determinant of a dense matrix, which is traditionally obtained iteratively with a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. For calculations with thousands of electrons, this operation dominates the execution profile. We propose a novel rank-$k$ delayed update scheme. This strategy enables probability evaluation for multiple successive Monte Carlo moves, with application of accepted moves to the matrices delayed until after a predetermined number of moves, $k$. Accepted moves grouped in this manner are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency. This procedure does not change the underlying Monte Carlo sampling or the sampling efficiency. For large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order-of-magnitude speedups can be obtained on both multi-core CPUs and GPUs, making this algorithm highly advantageous for current petascale and future exascale computations.
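The idea of delaying updates can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it uses the textbook Woodbury identity and matrix determinant lemma, assumes each move replaces one row of the Slater matrix, and assumes the moves within a block touch distinct rows. The function names `block_ratio` and `flush`, the matrix sizes, and the simplified acceptance test are all hypothetical choices made for illustration.

```python
import numpy as np

def block_ratio(Ainv, rows, V):
    """det(A') / det(A) when rows `rows` of A are replaced by the rows of V.

    By the matrix determinant lemma this equals det(S), where
    S = V @ Ainv[:, rows] is only k x k. Assumes distinct row indices.
    """
    if len(rows) == 0:
        return 1.0
    S = V @ Ainv[:, rows]
    return np.linalg.det(S)

def flush(A, Ainv, rows, V):
    """Apply k accepted row replacements at once via the Woodbury identity.

    A'^{-1} = A^{-1} - A^{-1} E S^{-1} (V A^{-1} - E^T),
    where E selects the replaced rows and S = V A^{-1} E.
    """
    S = V @ Ainv[:, rows]                    # k x k
    W = V @ Ainv                             # k x n
    W[np.arange(len(rows)), rows] -= 1.0     # W = V A^{-1} - E^T
    Ainv_new = Ainv - Ainv[:, rows] @ np.linalg.solve(S, W)
    A_new = A.copy()
    A_new[rows] = V
    return A_new, Ainv_new

# Illustrative driver with hypothetical sizes: n orbitals, blocks of k moves.
rng = np.random.default_rng(0)
n, k = 6, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
Ainv = np.linalg.inv(A)

rows, new_rows = [], []   # accepted but not yet applied moves
current = 1.0             # det(pending state) / det(A)
for p in range(k):
    v = A[p] + 0.1 * rng.standard_normal(n)       # proposed new row for electron p
    trial = block_ratio(Ainv, rows + [p], np.array(new_rows + [v]))
    ratio = trial / current                       # determinant ratio for this single move
    # Simplified acceptance test on the squared determinant ratio only.
    if rng.random() < min(1.0, ratio ** 2):
        rows.append(p)
        new_rows.append(v)
        current = trial

if rows:
    # One rank-k update built from matrix-matrix products.
    A, Ainv = flush(A, Ainv, rows, np.array(new_rows))
```

In this sketch, the ratio for each proposed move is computed from small $k \times k$ determinants while `Ainv` stays untouched, and the full inverse is modified only once per block in `flush`. Replacing $k$ separate rank-1 updates with a single rank-$k$ update built from matrix-matrix products is what the abstract refers to as enhanced arithmetic intensity.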

Authors

  • Paul Kent

    Oak Ridge National Laboratory, Oak Ridge, TN

  • Tyler McDaniel

    University of Tennessee, Knoxville, TN

  • Ying Wai Li

    Oak Ridge National Laboratory, Oak Ridge, TN

  • Ed D'Azevedo

    Oak Ridge National Laboratory, Oak Ridge, TN