Protein Identification Using a Single-Layer MoS2 Nanopore: Towards Machine Learning-Based Predictive Models

ORAL

Abstract

Protein identification can enable breakthrough advances in the early diagnosis of diseases and in monitoring human health. Nanopore sequencing can serve as a label-free, single-base, fast-reading platform for identifying the amino acids of a protein. A current challenge for nanopore technology is the noise in ionic current measurements. Here, we show that a nanopore drilled in single-layer molybdenum disulfide (MoS2) can detect each individual amino acid in a polypeptide chain with high distinguishability. Using extensive molecular dynamics (MD) simulations (with a total aggregate simulation time of 65 µs) and machine learning (ML) techniques, we characterize and cluster the ionic current and residence time of the 20 human amino acids. Using a train-test split with logistic regression and nearest-neighbor classifiers, the sensor read is predicted with an accuracy of up to 99.6%. In addition, ML classification techniques are used to predict the amino acid types of over 2.5 million hypothetical sensor reads.
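The classification step described above can be illustrated with a minimal sketch, assuming each sensor read is reduced to two features (ionic current and residence time) and labeled with an amino acid type. The data below are synthetic placeholders with arbitrary cluster centers, not values from the MD simulations, and the function names (`make_synthetic_reads`, `train_test_split`, `predict_nearest_neighbor`, `accuracy`) are hypothetical. Only a 1-nearest-neighbor classifier on a held-out split is shown; the logistic regression classifier is omitted.

```python
import random


def make_synthetic_reads(n_classes=4, reads_per_class=50, seed=0):
    """Generate placeholder (ionic current, residence time) reads.

    Cluster centers and spreads are arbitrary illustrative values,
    NOT data from the study.
    """
    rng = random.Random(seed)
    reads = []
    for label in range(n_classes):
        current_center = 1.0 + 2.0 * label  # arbitrary units
        time_center = 5.0 + 3.0 * label     # arbitrary units
        for _ in range(reads_per_class):
            features = (rng.gauss(current_center, 0.2),
                        rng.gauss(time_center, 0.3))
            reads.append((features, label))
    return reads


def train_test_split(reads, test_fraction=0.25, seed=0):
    """Shuffle the reads and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = reads[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict_nearest_neighbor(train, features):
    """Return the label of the training read closest in feature space."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(train, key=lambda item: squared_distance(item[0], features))
    return nearest[1]


def accuracy(train, test):
    """Fraction of held-out reads whose predicted label is correct."""
    correct = sum(predict_nearest_neighbor(train, f) == y for f, y in test)
    return correct / len(test)


if __name__ == "__main__":
    reads = make_synthetic_reads()
    train, test = train_test_split(reads)
    print(f"held-out accuracy: {accuracy(train, test):.3f}")
```

With real reads, the two features would likely need to be standardized before computing distances, since ionic current and residence time are measured on different scales.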

Presenters

  • Mohammad Heiranian

    Univ of Illinois - Urbana

Authors

  • Mohammad Heiranian

    Univ of Illinois - Urbana

  • Amir Barati Farimani

    Univ of Illinois - Urbana, Chemistry, Stanford University

  • N. Aluru

    Mechanical Science and Engineering, University of Illinois at Urbana–Champaign