Prototype selection for interpretable classification

From MaRDI portal
Publication:765979

DOI: 10.1214/11-AOAS495
zbMATH Open: 1234.62096
arXiv: 1202.5933
Wikidata: Q102362768
Scholia: Q102362768
MaRDI QID: Q765979
FDO: Q765979


Authors: Jacob Bien, Robert Tibshirani


Publication date: 22 March 2012

Published in: The Annals of Applied Statistics

Abstract: Prototype methods seek a minimal subset of samples that can serve as a distillation or condensed view of a data set. As the size of modern data sets grows, being able to present a domain specialist with a short list of "representative" samples chosen from the data set is of increasing interpretative value. While much recent statistical research has been focused on producing sparse-in-the-variables methods, this paper aims at achieving sparsity in the samples. We discuss a method for selecting prototypes in the classification setting (in which the samples fall into known discrete categories). Our method of focus is derived from three basic properties that we believe a good prototype set should satisfy. This intuition is translated into a set cover optimization problem, which we solve approximately using standard approaches. While prototype selection is usually viewed as purely a means toward building an efficient classifier, in this paper we emphasize the inherent value of having a set of prototypical elements. That said, by using the nearest-neighbor rule on the set of prototypes, we can of course discuss our method as a classifier as well.
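The abstract's pipeline (pick a small, class-aware set of covering prototypes, then classify new points by the nearest-neighbor rule on that set) can be sketched with a greedy set-cover heuristic, one of the "standard approaches" for approximating set cover. The covering rule, the radius parameter `eps`, and all function names below are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

def select_prototypes(X, y, eps, max_prototypes=None):
    """Greedily pick training points whose eps-ball covers many
    not-yet-covered same-class points while covering few points
    of other classes (an assumed, simplified covering objective)."""
    n = len(X)
    # cover[i, j] is True when point j lies in the eps-ball around point i.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    cover = d <= eps
    covered = np.zeros(n, dtype=bool)
    prototypes = []
    while True:
        gains = np.full(n, -np.inf)
        for i in range(n):
            if i in prototypes:
                continue
            same = cover[i] & (y == y[i])
            wrong = cover[i] & (y != y[i])
            # Reward newly covered same-class points; penalize wrong-class ones.
            gains[i] = np.sum(same & ~covered) - np.sum(wrong)
        best = int(np.argmax(gains))
        if gains[best] <= 0:
            break
        prototypes.append(best)
        covered |= cover[best] & (y == y[best])
        if covered.all():
            break
        if max_prototypes and len(prototypes) >= max_prototypes:
            break
    return prototypes

def classify(X_proto, y_proto, x):
    """Nearest-neighbor rule on the prototype set."""
    d = np.linalg.norm(X_proto - x, axis=1)
    return y_proto[int(np.argmin(d))]
```

On a toy data set of two well-separated clusters, one prototype per class suffices, which is exactly the "condensed view" the abstract describes.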


Full work available at URL: https://arxiv.org/abs/1202.5933




Cited in: 13 documents
