Intrinsic dimension of geometric data sets (Q2075416)
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Intrinsic dimension of geometric data sets | scientific article | |
Statements
Intrinsic dimension of geometric data sets (English)
14 February 2022
Following \textit{V. Pestov}'s suggestions in [``Intrinsic dimension of a dataset: what properties does one expect?'', in: Proceedings of the international joint conference on neural networks, IJCNN 2007, celebrating 20 years of neural networks, Orlando, Florida, USA, August 12--17, 2007, 2959--2964 (2007)], the authors extend Pestov's results in [\textit{V. Pestov}, Inf. Process. Lett. 73, No. 1--2, 47--51 (2000; Zbl 1339.68245); Neural Netw. 21, No. 2--3, 204--213 (2008; Zbl 1254.68102)] to geometric data sets. A geometric data set is a triple \(D = (X, F, \mu)\) consisting of a set \(X\), a tame set \(F\subset \mathbb{R}^X\) such that \((X, d_F)\) is a separable complete metric space, and a probability measure \(\mu\) with full support on the Borel \(\sigma\)-algebra of \((X, d_F)\). Here \(d_F(x, y):=\sup\{|f(x)-f(y)| \mid f\in F\}\), and the set of functions \(F\) is called tame if \(d_F(x,y) < \infty\) for all \(x, y \in X\). After showing that the observable distance is a metric on the set of isomorphism classes of geometric data sets, the authors consider the concentration and observable diameters of data. In the last two parts of the paper, the authors define a dimension function as an axiomatic approach to the intrinsic dimension of geometric data sets and compute the dimension function for data sets in \(\mathbb{R}^n\) and for data sets resembling incidence structures.
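To make the definition of \(d_F\) concrete, the following minimal sketch evaluates it on a toy example. It assumes a finite point set and a finite feature set, so the supremum reduces to a maximum and tameness holds automatically. The names `feature_distance`, `points`, and `features` are illustrative and do not come from the paper.

```python
from itertools import combinations


def feature_distance(x, y, features):
    """Return d_F(x, y) = sup{|f(x) - f(y)| : f in F}.

    F is assumed finite here, so the supremum is a plain maximum
    and is always finite (F is tame).
    """
    return max(abs(f(x) - f(y)) for f in features)


# Hypothetical toy data: three points in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]

# Hypothetical feature set F: the two coordinate projections.
features = [lambda p: p[0], lambda p: p[1]]

# Pairwise feature distances over all unordered pairs of points.
distances = {
    (x, y): feature_distance(x, y, features)
    for x, y in combinations(points, 2)
}

for (x, y), d in distances.items():
    print(x, y, d)
```

With the coordinate projections as features, \(d_F\) coincides with the Chebyshev (maximum) distance on these points. In general \(d_F\) is only a pseudometric; it is a metric when \(F\) separates the points of \(X\), which the definition above requires by asking that \((X, d_F)\) be a metric space.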
geometric data sets
observable distance
dimension function
dimension curse
machine learning