Outlier detection under interval uncertainty: algorithmic solvability and computational complexity (Q2484030): Difference between revisions

The purpose of the present paper is to investigate outlier detection problems under the interval uncertainty approach, to provide an analysis of their computational complexity, to obtain new efficient algorithms that solve some of these problems (that are tractable under reasonable conditions), and to take into consideration related open problems. The traditional engineering approach to outlier detection takes the following steps: (a) The measurement, ``normal'' values \(x_1, x_2, \dots, x_n\) are collected. (b) For these normal values the sample average \(E = (x_1 + \cdots+ x_n) / n\) and the standard deviation \(\sigma = \sqrt V\) are computed, where \(V = [(x_1 - E)^2 +\cdots + (x_n - E)^2] / n.\) (c) A new measurement result \(x\) is classified as an outlier if \(x\) is outside the \(k_0\sigma\) interval \([L, U]\), where \(L = E - k_0\sigma, U = E + k_0\sigma\), and \(k_0 > 1\) is some preselected parameter (most frequently, \(k_0 = 2, 3\), or 6). In real life, the normal values \(x_1,\dots, x_n\) are situated within certain interval ranges, and for different values of \(x_i\) \((i = 1,\dots,n)\) within its interval, one get different bounds \(L\) and \(U\) of the \(k_0\sigma\) interval. Detecting now the outliers requires to obtain: (1) the possible outliers, defined as located outside of (at least) one of the possible \(k_0\sigma\) intervals \([L, U]\), and (2) the guaranteed outliers, defined as being located outside of all possible \(k_0\sigma\) intervals \([L, U]\). It is thus essential to compute the exact ranges of the interval bounds \(L\) and \(U\). The main results may now be summarized as follows: (i) Computing the exact ranges of the outlier interval bounds \(L\) and \(U\) is proved to be, in the general case, an NP-hard problem. (ii) The authors propose efficient (quadratic-time) algorithms that compute the ranges of \(L\) and \(U\) under reasonable conditions. (iii) Once a value is identified as an outlier for a fixed parameter value \(k_0\), the paper shows how to find out to what degree this value is an outlier, i.e. what is the largest value \(k_0\) for which this value is an outlier for sure.

0 references

reviewed by

Neculai Curteanu

0 references

zbMATH Keywords

outlier detection problem

0 references

interval uncertainty approach

0 references

computational complexity

0 references

possible outliers

0 references

guaranteed outliers

0 references

interval ranges

0 references

exact ranges of interval bounds

0 references

efficient quadratic-time algorithms

0 references

nonlinear optimization

0 references

Identifiers

zbMATH Open document ID

1076.65014

0 references

DOI

10.1007/s11155-005-5943-7

0 references

Mathematics Subject Classification ID

0 references

0 references

0 references

0 references

0 references

Sitelinks

Mathematics(1 entry)

mardi Publication:2484030

@@ Property / author @@
-Vladik Ya. Kreinovich
@@ Property / author: Vladik Ya. Kreinovich / rank @@
-Normal rank
@@ Property / author @@
-Lev R. Ginzburg
@@ Property / author: Lev R. Ginzburg / rank @@
-Normal rank