Abstract: In data sequences measured over space or time, an important problem is accurate detection of abrupt changes. In partially labeled data, it is important to correctly predict presence/absence of changes in positive/negative labeled regions, in both the train and test sets. One existing dynamic programming algorithm is designed for prediction in unlabeled test regions (and ignores the labels in the train set); another is for accurate fitting of train labels (but does not predict changepoints in unlabeled test regions). We resolve these issues by proposing a new optimal changepoint detection model that is guaranteed to fit the labels in the train data, and can also provide predictions of unlabeled changepoints in test data. We propose a new dynamic programming algorithm, Labeled Optimal Partitioning (LOPART), and we provide a formal proof that it solves the resulting non-convex optimization problem. We provide theoretical and empirical analysis of the time complexity of our algorithm, in terms of the number of labels and the size of the data sequence to segment. Finally, we provide empirical evidence that our algorithm is more accurate than the existing baselines, in terms of train and test label error.
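To illustrate the kind of label-constrained problem the abstract describes, the following is a minimal quadratic-time sketch and not the authors' LOPART implementation. It assumes a squared-error loss with a constant mean per segment, a penalty added once per changepoint, and non-overlapping labels. The label encoding `(start, end, changes)` and the name `labeled_partition` are hypothetical, chosen only for this example. A label covers the candidate change positions `start+1..end`, where position `p` means a change between `x[p-1]` and `x[p]`, and `changes` is 0 (no change allowed) or 1 (exactly one change required).

```python
import math

def labeled_partition(x, labels, penalty):
    """Sketch: penalized squared-error segmentation under label constraints.

    labels: list of (start, end, changes) with 0-based data indices (assumed
    encoding); returns (sorted changepoint positions, total penalized cost).
    """
    n = len(x)
    # Prefix sums give the squared error of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def seg_cost(s, t):
        total = s1[t] - s1[s]
        return (s2[t] - s2[s]) - total * total / (t - s)

    # region_of[p] is the index of the label covering position p, or -1.
    region_of = [-1] * (n + 1)
    for k, (start, end, _) in enumerate(labels):
        for p in range(start + 1, end + 1):
            region_of[p] = k

    def change_allowed(p):
        k = region_of[p]
        return k == -1 or labels[k][2] == 1

    def segment_allowed(s, t):
        # A positive label lying strictly inside the segment would get no change.
        for start, end, changes in labels:
            if changes == 1 and s <= start and end < t:
                return False
        # Two consecutive changes inside one positive label would give it two.
        if 0 < s and t < n and region_of[s] != -1 and region_of[s] == region_of[t]:
            return False
        return True

    best = [math.inf] * (n + 1)
    prev = [0] * (n + 1)
    best[0] = 0.0
    for t in range(1, n + 1):
        if t < n and not change_allowed(t):
            continue  # position t cannot be a changepoint
        for s in range(t):
            if best[s] == math.inf or not segment_allowed(s, t):
                continue
            cand = best[s] + seg_cost(s, t) + (penalty if s > 0 else 0.0)
            if cand < best[t]:
                best[t], prev[t] = cand, s

    # Backtrack through the stored last-changepoint pointers.
    found, t = [], n
    while prev[t] > 0:
        t = prev[t]
        found.append(t)
    return sorted(found), best[n]
```

With no labels this reduces to ordinary penalized optimal partitioning. A negative label removes candidate positions, and a positive label forces exactly one change in its region even when the data alone would not support one.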
Cites work
- scientific article; zbMATH DE number 3146392
- scientific article; zbMATH DE number 3444596
- A Cluster Analysis Method for Grouping Means in the Analysis of Variance
- A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data
- Algorithms for the optimal identification of segment neighborhoods
- Continuous Inspection Schemes
- Constrained dynamic programming and supervised penalty learning algorithms for peak detection in genomic data
- Estimating the dimension of a model
- Estimating the number of change-points via Schwarz' criterion
- Greedy Kernel Change-Point Detection
- On optimal multiple changepoint algorithms for large data
- Optimal detection of changepoints with a linear computational cost
This page was built for publication: Labeled Optimal Partitioning