Beyond AI: Using information theory to extract information from data without fitting
ORAL · Invited
Abstract
Standard machine learning and artificial intelligence (ML/AI) techniques extract information from data by fitting a probability distribution. These methods can be computationally expensive and have an Achilles' heel: they tend to fit the noise along with the information, which leads to overfitting. Furthermore, they usually have to limit the number of features included in the analysis, either through ad hoc methods or by invoking investigator knowledge. I present an alternative to standard ML/AI methods that requires no fitting, because it extracts the information in the data using tools from information theory. Because no fitting is involved, the new method extracts only the information and, as a consequence, is mathematically superior to any method that cannot distinguish between information and noise. I show applications of this method to a diverse set of problems in biology, including predicting the function of biomolecules from sequence data, predicting future disease from microbiome data, and predicting patient drug resistance from bulk transcriptomics. Because the method is universal, it should ultimately supplant all existing ML/AI approaches.
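As a rough illustration of what "extracting information without fitting" can look like, the sketch below computes a plug-in estimate of the mutual information I(X;Y) between a discrete feature and a discrete label directly from observed counts. It is not the method presented in the talk, whose details the abstract does not give; the function name `mutual_information` and the toy data are assumptions made for this example only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Illustrative only: a counting-based estimate with no model parameters,
    not the method described in the abstract.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: a feature that determines the label,
# and a feature that is statistically independent of it.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

mi_informative = mutual_information(informative, labels)
mi_uninformative = mutual_information(uninformative, labels)
```

No parameters are optimized here; the estimate follows from the observed frequencies alone. With few samples the plug-in estimate is biased upward, so a practical application would likely need a bias correction or a significance test, neither of which is shown.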
Publications:
Jackson Kubal, Vincent R. Ragusa, and Christoph Adami, Beyond AI: Predicting drug response from transcriptomics using information theory. Manuscript in preparation.
Vincent R. Ragusa and Christoph Adami, Automatic Generation of Highly Functional Sequences. Manuscript in preparation.
Presenters
- Christoph Adami, Michigan State University