Protein Identification Using a Single-Layer MoS2 Nanopore: Towards Machine Learning-Based Predictive Models
ORAL
Abstract
Protein identification could enable breakthrough advances in the early diagnosis of disease and in monitoring human health. Nanopore sequencing can be used as a label-free, fast platform for reading the amino acids of a protein one residue at a time. The current challenge with nanopore technology is noise in the ionic current measurements. Here, we show that a nanopore drilled in single-layer molybdenum disulfide (MoS2) can detect each individual amino acid in a polypeptide chain with high distinguishability. Using extensive molecular dynamics (MD) simulations (total aggregate simulation time of 65 µs) and machine learning (ML) techniques, we characterize and cluster the ionic current and residence time of the 20 human amino acids. Using a train/test split with logistic regression and nearest-neighbor classifiers, the amino acid type of a sensor read is predicted with an accuracy of up to 99.6%. In addition, ML classification techniques are used to predict the amino acid types of over 2.5 million hypothetical sensor reads.
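As an illustration of the classification step described in the abstract, the following is a minimal sketch of a nearest-neighbor classifier applied to sensor reads, each represented by two features: ionic current and residence time. It is not the authors' pipeline. The three cluster centers, the noise level, the units, and all function names (make_reads, train_test_split, knn_predict, accuracy) are assumptions invented for this example, and the data are synthetic rather than taken from the MD simulations.

```python
import math
import random
from collections import Counter

# Hypothetical cluster centers: (ionic current, residence time), arbitrary units.
# These values are invented for illustration and are NOT results from the study.
CENTERS = {
    "GLY": (2.0, 1.0),
    "ALA": (1.5, 3.0),
    "TRP": (0.5, 6.0),
}


def make_reads(n_per_class, noise=0.15, seed=0):
    """Generate synthetic (features, label) reads scattered around each center."""
    rng = random.Random(seed)
    reads = []
    for label, (current, dwell) in CENTERS.items():
        for _ in range(n_per_class):
            features = (rng.gauss(current, noise), rng.gauss(dwell, noise))
            reads.append((features, label))
    rng.shuffle(reads)
    return reads


def train_test_split(reads, test_fraction=0.25):
    """Hold out the first test_fraction of the (already shuffled) reads."""
    n_test = int(len(reads) * test_fraction)
    return reads[n_test:], reads[:n_test]


def knn_predict(train, point, k=5):
    """Predict a label by majority vote among the k nearest training reads."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of held-out reads whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


reads = make_reads(n_per_class=100)
train, test = train_test_split(reads)
print(f"held-out accuracy: {accuracy(train, test):.3f}")
```

With real data the two features would typically be standardized before computing distances, since ionic current and residence time are measured in different units. A logistic regression classifier could be evaluated on the same split for comparison.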
Presenters
Mohammad Heiranian
Univ of Illinois - Urbana
Authors
Mohammad Heiranian
Univ of Illinois - Urbana
Amir Barati Farimani
Univ of Illinois - Urbana, Chemistry, Stanford University
N. Aluru
Mechanical Science and Engineering, University of Illinois at Urbana–Champaign