An improved data stream algorithm for clustering (Q904105)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | An improved data stream algorithm for clustering |
scientific article |
Statements
An improved data stream algorithm for clustering (English)
0 references
15 January 2016
0 references
The paper deals with the problem of the \(k\)-center problem, i.e., given a set of points \(P\) in a metric space and an integer \(k\), we want to find \(k\) points \(c_1, c_2, \dots, c_k\) such that \(\max_{p \in P} \min_{1\leq i \leq k} |p c_i|\) is minimized, where \(|pq|\) denotes the distance between \(p\) and \(q\). The authors make some additional assumptions, namely they take the \(d\)-dimensional Euclidean metric space \(\mathbb R^d\). Moreover, they assume that the points arrive one by one in a stream and we are allowed to examine each point only once, and only a small amount of information can be stored in a device. The authors start with giving a single-pass data stream algorithm that for any \(d \geq 1\) uses the \(O(d / \varepsilon)\) space and returns two centers together with their radii that guarantee a \((1.8 + \varepsilon)\)-approximation for the Euclidean \(2\)-center problem. Each update in the algorithm takes \(O(d / \varepsilon)\) time. Later, they extend the algorithm to the Euclidean \(k\)-center problem. They design an approximation algorithm that for some constant \(k > 2\) and for any dimension \(d \geq 1\) uses \(O(2^k (k+3)! d / \varepsilon )\) space and returns \(k\) centers that guarantee a \((1.8 + \varepsilon)\)-approximation. Each update in the algorithm takes \(O(2^k (k+2)! d / \varepsilon)\) time.
0 references
Euclidean \(k\)-clustering
0 references
Euclidean \(k\)-center problem
0 references
streaming algorithms
0 references
computational geometry
0 references