Model-independent detection of new physics signals using interpretable semisupervised classifier tests
From MaRDI portal
Publication:6138599
Abstract: A central goal in experimental high energy physics is to detect new physics signals that are not explained by known physics. In this paper, we aim to search for new signals that appear as deviations from known Standard Model physics in high-dimensional particle physics data. To do this, we determine whether there is any statistically significant difference between the distribution of Standard Model background samples and the distribution of the experimental observations, which are a mixture of the background and a potential new signal. Traditionally, one also assumes access to a sample from a model for the hypothesized signal distribution. Here we instead investigate a model-independent method that does not make any assumptions about the signal and uses a semi-supervised classifier to detect the presence of the signal in the experimental data. We construct three test statistics using the classifier: an estimated likelihood ratio test (LRT) statistic, a test based on the area under the ROC curve (AUC), and a test based on the misclassification error (MCE). Additionally, we propose a method for estimating the signal strength parameter and explore active subspace methods to interpret the proposed semi-supervised classifier in order to understand the properties of the detected signal. We also propose a Score test statistic that can be used in the model-dependent setting. We investigate the performance of the methods on a simulated data set related to the search for the Higgs boson at the Large Hadron Collider at CERN. We demonstrate that the semi-supervised tests have power competitive with the classical supervised methods for a well-specified signal, but much higher power for an unexpected signal which might be entirely missed by the supervised tests.
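As an illustration of the AUC-based test mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-dimensional Gaussian background, an experimental sample containing a 30% signal fraction shifted by 3 units, and a simple mean-difference linear score standing in for whatever classifier one would actually train. All function names and parameter values here are hypothetical. The classifier is fitted on one half of each sample, the AUC is computed on the held-out halves, and significance is calibrated by permuting the held-out labels.

```python
import random


def auc(bg_scores, exp_scores):
    """P(random experimental score > random background score); ties count 1/2."""
    pooled = sorted([(s, 0) for s in bg_scores] + [(s, 1) for s in exp_scores])
    rank_sum, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the ranks i+1 .. j
        rank_sum += midrank * sum(label for _, label in pooled[i:j])
        i = j
    m, n = len(exp_scores), len(bg_scores)
    return (rank_sum - m * (m + 1) / 2.0) / (m * n)


def fit_mean_difference(bg_train, exp_train):
    """Toy linear score (assumption): project onto the difference of class means."""
    dim = len(bg_train[0])
    w = [
        sum(x[k] for x in exp_train) / len(exp_train)
        - sum(x[k] for x in bg_train) / len(bg_train)
        for k in range(dim)
    ]
    return lambda x: sum(wk * xk for wk, xk in zip(w, x))


def permutation_pvalue(bg_scores, exp_scores, rng, n_perm=199):
    """One-sided permutation p-value for the held-out AUC."""
    observed = auc(bg_scores, exp_scores)
    pooled = list(bg_scores) + list(exp_scores)
    n_bg = len(bg_scores)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if auc(pooled[:n_bg], pooled[n_bg:]) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (n_perm + 1)


def sample_background(rng, n):
    """Toy background: standard normal in two dimensions."""
    return [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]


def sample_experiment(rng, n, signal_fraction, shift):
    """Toy experimental data: mixture of background and a shifted signal."""
    data = []
    for _ in range(n):
        mu = shift if rng.random() < signal_fraction else 0.0
        data.append([rng.gauss(mu, 1), rng.gauss(mu, 1)])
    return data


rng = random.Random(0)
background = sample_background(rng, 400)
experiment = sample_experiment(rng, 400, signal_fraction=0.3, shift=3.0)

# Fit on the first halves, evaluate on the held-out second halves.
score = fit_mean_difference(background[:200], experiment[:200])
bg_scores = [score(x) for x in background[200:]]
exp_scores = [score(x) for x in experiment[200:]]

observed_auc, p_value = permutation_pvalue(bg_scores, exp_scores, rng)
print("held-out AUC:", round(observed_auc, 3), "permutation p-value:", p_value)
```

Under the null hypothesis of no signal, the held-out scores of the two samples are exchangeable, so the permutation p-value is valid regardless of how good the classifier is. A better classifier only improves power.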
Cites work
- scientific article; zbMATH DE number 708500
- scientific article; zbMATH DE number 1104922
- scientific article; zbMATH DE number 6791313
- A Direct Approach to False Discovery Rates
- A course on point processes
- A multivariate two-sample test based on the number of nearest neighbor type coincidences
- Active subspace methods in theory and practice: applications to kriging surfaces
- Active subspace of neural networks: structural analysis and universal attacks
- Active subspaces. Emerging ideas for dimension reduction in parameter studies
- An introduction to generalized linear models
- Classification accuracy as a proxy for two-sample testing
- Distribution-free predictive inference for regression
- Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet
- Global and local two-sample tests via regression
- Multivariate Two-Sample Tests Based on Nearest Neighbors
- Random forests
- The Elements of Statistical Learning
- The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family
- Variable importance in binary regression trees and forests