Variable Importance: A Neural Network-Based Variable Ranking Framework in the CMS Search for Four-Top Production
ORAL
Abstract
A variable ranking system that uses dense neural networks trained for binary classification is presented in the context of the search for four-top production in the single-lepton final state. The ranking framework, Variable Importance, defines the importance metric as the difference in Area Under the Curve (AUC) scores between two trained networks, one with N inputs and the other with N-1 inputs. The quantity used for ranking is the significance, defined as the ratio of the mean to the RMS of the importance metric distribution. Variable Importance is characterized by its large-scale computing approach and includes a method to account for correlations between variables. Beyond determining a ranking order, the framework includes steps for hyperparameter optimization and k-fold cross-validation to obtain a final trained model. Tests of how well the significance predicts model performance are shown, along with the simulation-based significance and limits for the four-top, single-lepton final state derived from the discriminator of the final trained model.
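The ranking quantity described in the abstract can be illustrated with a minimal sketch. It is not the framework's actual code and rests on several assumptions: the AUC scores are taken as already computed, the importance is signed as AUC(N inputs) minus AUC(N-1 inputs), and "RMS" is read as the spread (standard deviation) of the importance distribution. The function names and the variable names `HT` and `njets` are hypothetical.

```python
from statistics import mean, pstdev


def importance(auc_with, auc_without):
    """AUC difference between an N-input and an (N-1)-input network.

    Sign convention (assumed): positive means the extra input helped.
    """
    return auc_with - auc_without


def significance(importances):
    """Mean of the importance distribution divided by its spread.

    Assumption: "RMS" is taken as the population standard deviation.
    """
    spread = pstdev(importances)
    if spread == 0:
        raise ValueError("importance distribution has zero spread")
    return mean(importances) / spread


def rank_variables(auc_pairs):
    """Order variables by decreasing significance.

    auc_pairs maps a variable name to a list of
    (auc_with_variable, auc_without_variable) tuples, one per network pair.
    """
    scores = {
        var: significance([importance(a, b) for a, b in pairs])
        for var, pairs in auc_pairs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy AUC values, invented purely for illustration.
toy_auc_pairs = {
    "HT": [(0.80, 0.75), (0.82, 0.76), (0.81, 0.77)],
    "njets": [(0.80, 0.79), (0.82, 0.83), (0.81, 0.80)],
}
ranking = rank_variables(toy_auc_pairs)
```

In this toy input, `HT` improves the AUC consistently across network pairs while `njets` fluctuates around zero, so `HT` ranks first. Dividing by the spread penalizes variables whose apparent gain depends on which other inputs or training run they happened to be paired with.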
Authors
- Daniel Li (Brown University)
- Emanuele Usai (Brown University)
- Meenakshi Narain (Brown University)
- Ulrich Heintz (Brown University)