Concentration inequalities for two-sample rank processes with application to bipartite ranking

DOI 10.1214/21-EJS1907, MaRDI QID Q2233587

Authors Stephan Clémençon, Myrto Limnios, Nicolas Vayatis

Publication date 11 October 2021

Published in Electronic Journal of Statistics

Full work available at URL https://arxiv.org/abs/2104.02943, https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-15/issue-2/Concentration-inequalities-for-two-sample-rank-processes-with-application-to/10.1214/21-EJS1907.full

zbMATH Keywords

concentration inequalities, empirical risk minimization, generalization bounds, statistical learning theory, bipartite ranking, two-sample linear rank statistics, rank process

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Inequalities; stochastic orderings (60E15)
Empirical decision procedures; empirical Bayes procedures (62C12)
Nonparametric inference (62G99)

Abstract: The ROC curve is the gold standard for measuring the performance of a test/scoring statistic regarding its capacity to discriminate between two statistical populations in a wide variety of applications, ranging from anomaly detection in signal processing to information retrieval, through medical diagnosis. Most practical performance measures used in scoring/ranking applications such as the AUC, the local AUC, the p-norm push, the DCG and others, can be viewed as summaries of the ROC curve. In this paper, the fact that most of these empirical criteria can be expressed as two-sample linear rank statistics is highlighted and concentration inequalities for collections of such random variables, referred to as two-sample rank processes here, are proved, when indexed by VC classes of scoring functions. Based on these nonasymptotic bounds, the generalization capacity of empirical maximizers of a wide class of ranking performance criteria is next investigated from a theoretical perspective. It is also supported by empirical evidence through convincing numerical experiments.
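
As a rough illustration of the abstract's remark that criteria such as the AUC can be written as two-sample linear rank statistics, the sketch below computes the empirical AUC in two ways: by counting correctly ordered (positive, negative) pairs, and from the ranks of the positive scores in the pooled sample. The function names, the normalisation by N + 1, and the toy scores are assumptions made for this example and are not taken from the paper; ties are assumed absent.

```python
from typing import Callable, Sequence


def positive_ranks(pos: Sequence[float], neg: Sequence[float]) -> list[int]:
    """Rank of each positive score in the pooled sample (1 = smallest).

    Assumes all scores are distinct (no ties).
    """
    pooled = sorted(list(pos) + list(neg))
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    return [rank_of[value] for value in pos]


def linear_rank_statistic(
    pos: Sequence[float],
    neg: Sequence[float],
    phi: Callable[[float], float],
) -> float:
    """Sum of phi(rank / (N + 1)) over the positive sample, N = pooled size.

    The score-generating function phi is a free choice; phi(u) = u gives
    the rank-sum (Wilcoxon) statistic up to the 1 / (N + 1) factor.
    """
    n_total = len(pos) + len(neg)
    return sum(phi(r / (n_total + 1)) for r in positive_ranks(pos, neg))


def auc_pairwise(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked in the right order."""
    correct = sum(p > q for p in pos for q in neg)
    return correct / (len(pos) * len(neg))


def auc_from_ranks(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Empirical AUC recovered from the rank-sum statistic (no ties)."""
    n, m = len(pos), len(neg)
    # Undo the 1 / (N + 1) normalisation to get the plain sum of ranks.
    rank_sum = linear_rank_statistic(pos, neg, lambda u: u) * (n + m + 1)
    # Mann-Whitney identity: rank_sum = n (n + 1) / 2 + #{correct pairs}.
    return (rank_sum - n * (n + 1) / 2) / (n * m)


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    positives = [0.9, 0.8, 0.35, 0.6]
    negatives = [0.1, 0.4, 0.7]
    print(auc_pairwise(positives, negatives))    # 0.75
    print(auc_from_ranks(positives, negatives))  # 0.75
```

Passing a different phi to linear_rank_statistic changes which part of the ranking is emphasised; for instance, a phi that vanishes below some threshold concentrates on the top-ranked instances, in the spirit of the local AUC mentioned in the abstract.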

