Memory-efficient incremental kernel ridge regression with a smart chunking strategy

POSTER

Abstract

Kernel ridge regression (KRR) is one of the oldest machine-learning techniques and has been used extensively to model complex systems in both classification and regression problems. It is a very simple algorithm and has recently gained attention in the quantum chemistry community. Despite its extensive use, KRR has memory and scalability issues that limit its utility for large datasets. Several attempts have been made to improve KRR performance, including the recent incremental KRR based on iterative inversion of the kernel matrix. However, memory and scalability issues persist, especially for large increments. To alleviate this problem, we have developed incremental chunked Cholesky empirical KRR (ICCE-KRR), a memory-efficient incremental KRR that uses incremental Cholesky factorization with data chunking. With the help of data chunking, we eliminate the problem of exceeding RAM or GPU capacity, thereby making our approach ‘crash-proof’. I will present our preliminary results, in which this algorithm successfully predicts electron transfer couplings in dimer systems using large datasets. We expect the success of our method to inspire more ambitious undertakings in the quantum chemistry community in the future.
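The idea of growing a Cholesky factor chunk by chunk can be illustrated with a minimal sketch. It is not the ICCE-KRR implementation, which the abstract does not specify; it only shows the textbook block update. If the regularized kernel matrix of the data seen so far is L11 L11^T, then a new chunk with cross-kernel block K21 and regularized self-kernel block K22 gives L21 = K21 L11^(-T) and L22 = chol(K22 - L21 L21^T). The Gaussian kernel, the function names (`gaussian_kernel`, `extend_cholesky`, `fit_chunked`, `predict`), and the parameters `gamma`, `lam`, and `chunk_size` are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import cholesky, cho_solve, solve_triangular


def gaussian_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel between the rows of A and B (assumed kernel choice)."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def extend_cholesky(L, X_old, X_new, lam, gamma):
    """Extend the lower Cholesky factor of K(X_old, X_old) + lam*I by one chunk."""
    K22 = gaussian_kernel(X_new, X_new, gamma) + lam * np.eye(len(X_new))
    if L is None:
        # First chunk: ordinary Cholesky factorization.
        return cholesky(K22, lower=True)
    K21 = gaussian_kernel(X_new, X_old, gamma)
    # Solve L11 @ L21.T = K21.T by forward substitution.
    L21 = solve_triangular(L, K21.T, lower=True).T
    # Factor the Schur complement of the old block.
    L22 = cholesky(K22 - L21 @ L21.T, lower=True)
    n, m = L.shape[0], len(X_new)
    L_ext = np.zeros((n + m, n + m))
    L_ext[:n, :n] = L
    L_ext[n:, :n] = L21
    L_ext[n:, n:] = L22
    return L_ext


def fit_chunked(X, y, chunk_size, lam=1e-3, gamma=0.5):
    """Build the factor one chunk at a time, then solve (K + lam*I) alpha = y."""
    L = None
    for start in range(0, len(X), chunk_size):
        stop = min(start + chunk_size, len(X))
        L = extend_cholesky(L, X[:start], X[start:stop], lam, gamma)
    alpha = cho_solve((L, True), y)
    return L, alpha


def predict(X_train, alpha, X_test, gamma=0.5):
    """KRR prediction: kernel between test and training points times alpha."""
    return gaussian_kernel(X_test, X_train, gamma) @ alpha
```

In this sketch only one chunk of new kernel rows is formed at a time, but the full factor still lives in memory; a genuinely memory-bounded variant would also have to store or stream the factor in blocks, which is beyond what the abstract describes.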

* We gratefully acknowledge support from Academia Sinica and the National Science and Technology Council of Taiwan through projects 112-2123-M-001-002 and 111-2123-M-001-003.

Publication: Manuscript under preparation

Presenters

  • Aaditya Manjanath

    National Institute for Materials Science

Authors

  • Aaditya Manjanath

    National Institute for Materials Science

  • Erickson Fajiculay

    Academia Sinica

  • Ryoji Sahara

    National Institute for Materials Science

  • Chao-Ping Hsu

    Academia Sinica