Anomaly Detection in High Dimensional Data (Q74767): Difference between revisions

This article introduces the stray algorithm for detecting anomalies in high-dimensional data. Developed to overcome performance limitations of existing algorithms such as HDoutliers, the method identifies anomalies using extreme value theory, calculating a threshold for large distance gaps between observations. Tests on both synthetic and real datasets show that the stray algorithm outperforms HDoutliers in accuracy and computational efficiency. The stray algorithm is available as an open-source R package. (English)
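
The idea of flagging observations that sit beyond a large gap in distance can be illustrated with a minimal Python sketch. This is not the stray R package or its API. The function names (`knn_gap_scores`, `flag_by_gap`), the parameters `k`, `alpha`, and `tail_fraction`, and their default values are all assumptions made for illustration. In particular, the cutoff below is a crude stand-in that compares each gap with the average of the earlier gaps; it is not the extreme-value calculation the article describes.

```python
import math


def knn_gap_scores(points, k=3):
    """Score each point by the neighbour distance that follows the largest
    jump among its k nearest-neighbour distances (hypothetical helper)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[:k]
        best, best_gap = dists[0], 0.0
        for a, b in zip(dists, dists[1:]):
            if b - a > best_gap:
                best_gap, best = b - a, b
        scores.append(best)
    return scores


def flag_by_gap(scores, alpha=0.05, tail_fraction=0.5):
    """Return indices of points lying above the first unusually large gap
    in the upper tail of the sorted scores (simplified, assumed rule)."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    s = [scores[i] for i in order]
    start = max(int(len(s) * (1 - tail_fraction)), 1)
    for i in range(start, len(s)):
        prior_gaps = [s[j] - s[j - 1] for j in range(1, i)]
        typical = sum(prior_gaps) / len(prior_gaps) if prior_gaps else 0.0
        if typical == 0.0:
            # Degenerate case: all earlier scores are tied, so there is no scale yet.
            continue
        if s[i] - s[i - 1] > math.log(1 / alpha) * typical:
            return sorted(order[i:])
    return []


# A 3x3 grid of regular points plus one distant point at index 9.
points = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
scores = knn_gap_scores(points, k=3)
flagged = flag_by_gap(scores)
print(flagged)
```

Scoring by the distance after the largest jump, rather than by the nearest-neighbour distance alone, is meant to keep a small cluster of anomalies from hiding behind each other. The brute-force distance computation is quadratic in the number of points and is only suitable for small examples.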

Identifiers

arXiv ID: 1908.04000

DOI: 10.48550/arXiv.1908.04000

Sitelinks

Mathematics (1 entry)

mardi: Publication:74767

Revision as of 22:39, 24 November 2024 by Tconrad. Removed claim: summary_simple (P1639): Hey little buddy! Imagine you have a big box filled with colorful marbles, but one or two are different from the others. We need to find out which ones they are so we can separate them. In this story, people made a special tool called "the stray algorithm" that helps us see if any of our special marbles (called anomalies) stick out and need our attention. They compared their new method with an old one called HDoutliers, and guess what? Their... (Tag: Manual revert)

Revision as of 22:39, 24 November 2024 by Tconrad. Created claim: summary (P1638): This article introduces a novel algorithm for detecting anomalies in high-dimensional data, known as the stray algorithm. Developed to overcome limitations in the performance of existing algorithms like HDoutliers, this method identifies anomalies based on extreme value theory by calculating thresholds for large distance gaps between observations. Extensive testing with both synthetic and real datasets has demonstrated that the stray algorithm...
	Property / summary / rank: Normal rank