Predicting Microbial Community Compositions and Oil Contamination in Water Samples Using Neural Networks and Generative Models

ORAL

Abstract

Understanding microbial communities and their response to oil contamination is vital for effective environmental monitoring and biodegradation. However, modeling high-dimensional biological datasets is often challenging because experimental data are limited. To address this, we developed a model that predicts microbial compositions and oil contamination in water samples using artificial neural networks. Our approach integrates dimensionality reduction, a noise injection algorithm, and a variational autoencoder (VAE) to handle high-dimensional, non-linear, and sparse data. We demonstrate that dimensionality reduction based on feature importance from decision trees enhances model training performance. Additionally, we employ a noise injection method to generate synthetic data, which improves VAE training by helping the model learn the underlying data distribution. This straightforward combination of standard neural networks significantly enhances training performance and predictive power, achieving an R² of up to 0.99.
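
The noise injection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the noise model, so everything here is an assumption made for illustration: the function name `inject_noise`, zero-mean Gaussian noise, a noise amplitude set to a fraction (`rel_scale`) of each feature's standard deviation, and clipping at zero so that abundance-like values stay non-negative.

```python
import random
import statistics


def inject_noise(samples, copies=5, rel_scale=0.05, seed=0):
    """Return the original rows followed by `copies` noisy variants of each row.

    Assumptions (not taken from the abstract): Gaussian noise, amplitude
    proportional to the per-feature standard deviation, values clipped at 0.
    """
    rng = random.Random(seed)  # local generator keeps the output reproducible
    n_features = len(samples[0])

    # Per-feature population standard deviation sets the noise amplitude.
    stds = [
        statistics.pstdev([row[j] for row in samples])
        for j in range(n_features)
    ]

    augmented = [list(row) for row in samples]  # keep the measured data first
    for row in samples:
        for _ in range(copies):
            noisy = [
                max(0.0, x + rng.gauss(0.0, rel_scale * s))  # clip at zero
                for x, s in zip(row, stds)
            ]
            augmented.append(noisy)
    return augmented


# Three hypothetical samples with three relative-abundance features each.
measured = [[0.1, 0.5, 0.0], [0.3, 0.2, 0.0], [0.2, 0.4, 0.0]]
synthetic = inject_noise(measured, copies=4)
print(len(synthetic))  # 3 measured rows + 3 * 4 synthetic rows
```

In a workflow like the one described, the augmented rows would serve as additional training input for the VAE; the values of `copies` and `rel_scale` here are placeholders, not settings reported by the authors.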

*This work was supported by the Henes Center for Quantum Phenomena at Michigan Technological University. Special thanks to Prof. Ravindra Pandey, Department of Physics, Michigan Technological University, for the support.

Presenters

  • Tong Gao

    • Department of Physics, Michigan Technological University

Authors

  • Tong Gao

    • Department of Physics, Michigan Technological University
  • Isaac Bigcraft

    • Department of Biological Sciences, Michigan Technological University
  • Stephen Techtmann

    • Department of Biological Sciences, Michigan Technological University
  • Issei Nakamura

    • Department of Physics, Michigan Technological University