Memory-efficient incremental kernel ridge regression with a smart chunking strategy
POSTER
Abstract
Kernel ridge regression (KRR) is one of the oldest machine-learning techniques and has been used extensively to model complex systems in both classification and regression problems. It is a very simple algorithm that has recently gained attention in the quantum chemistry community. Despite its extensive use, KRR has memory and scalability issues that limit its utility for large datasets. Several attempts have been made to improve KRR performance, including the recent incremental KRR based on iterative inversion of the kernel matrix. However, memory and scalability issues persist, especially for large increments. To alleviate this problem, we have developed incremental chunked Cholesky empirical KRR (ICCE-KRR), a memory-efficient incremental KRR that uses incremental Cholesky factorization with data chunking. Data chunking avoids exceeding RAM or GPU memory capacity, making our approach ‘crash-proof’. We will present preliminary results of using this algorithm to predict electron transfer couplings in dimer systems from large datasets. We expect the success of our method to inspire more ambitious undertakings in the quantum chemistry community.
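As a rough illustration of the idea of growing a Cholesky factor chunk by chunk, the following is a minimal NumPy sketch. It is not the ICCE-KRR implementation, whose details the abstract does not give. The Gaussian kernel, the function names (`gaussian_kernel`, `extend_cholesky`, `fit_krr_chunked`), and the parameters `lam`, `gamma`, and `chunk_size` are all assumptions made for illustration. The sketch only bounds the size of the kernel blocks computed at each step; it still holds the full factor in memory and does not cover the "empirical" part of the method.

```python
import numpy as np


def gaussian_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel between the rows of A and B (assumed kernel choice)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def extend_cholesky(L11, K21, K22_reg):
    """Grow the lower Cholesky factor of a regularized kernel matrix by one block.

    L11     : (n, n) lower factor for the points seen so far
    K21     : (m, n) kernel between the new chunk and the old points
    K22_reg : (m, m) kernel among the new points, with the ridge term added
    """
    # L21 = K21 @ inv(L11).T. np.linalg.solve is a general solver; a dedicated
    # triangular solver (e.g. scipy.linalg.solve_triangular) would exploit
    # the triangular structure of L11.
    L21 = np.linalg.solve(L11, K21.T).T
    # Factorize the Schur complement of the old block.
    L22 = np.linalg.cholesky(K22_reg - L21 @ L21.T)
    n, m = L11.shape[0], K22_reg.shape[0]
    L = np.zeros((n + m, n + m))
    L[:n, :n] = L11
    L[n:, :n] = L21
    L[n:, n:] = L22
    return L


def fit_krr_chunked(X, y, lam=1e-3, chunk_size=64, gamma=0.5):
    """Return KRR coefficients alpha solving (K + lam*I) alpha = y, chunk by chunk."""
    L = None
    for start in range(0, len(X), chunk_size):
        Xc = X[start:start + chunk_size]
        K22 = gaussian_kernel(Xc, Xc, gamma) + lam * np.eye(len(Xc))
        if L is None:
            L = np.linalg.cholesky(K22)  # factor of the first chunk
        else:
            K21 = gaussian_kernel(Xc, X[:start], gamma)
            L = extend_cholesky(L, K21, K22)
    # Two solves with the factor: L z = y, then L^T alpha = z.
    z = np.linalg.solve(L, y)
    return np.linalg.solve(L.T, z)
```

Predictions for new points `X_new` would then be `gaussian_kernel(X_new, X, gamma) @ alpha`. Only one `chunk_size`-row kernel block is formed at a time, which is the part of the memory footprint that chunking controls in this sketch.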
* We gratefully acknowledge support from Academia Sinica and the National Science and Technology Council of Taiwan through projects 112-2123-M-001-002 and 111-2123-M-001-003.
Publication: Manuscript under preparation
Presenters
- Aaditya Manjanath, National Institute for Materials Science
Authors
- Aaditya Manjanath, National Institute for Materials Science
- Erickson Fajiculay, Academia Sinica
- Ryoji Sahara, National Institute for Materials Science
- Chao-Ping Hsu, Academia Sinica