Dimensionality-dependent generalization bounds for k-dimensional coding schemes
From MaRDI portal
Publication:5380581
Abstract: The \(k\)-dimensional coding schemes refer to a collection of methods that attempt to represent data using a set of representative \(k\)-dimensional vectors, and include non-negative matrix factorization, dictionary learning, sparse coding, \(k\)-means clustering and vector quantization as special cases. Previous generalization bounds for the reconstruction error of \(k\)-dimensional coding schemes are mainly dimensionality-independent. A major advantage of these bounds is that they can be used to analyze the generalization error when data are mapped into an infinite- or high-dimensional feature space. However, many applications use finite-dimensional data features. Can we obtain dimensionality-dependent generalization bounds for \(k\)-dimensional coding schemes that are tighter than dimensionality-independent bounds when data lie in a finite-dimensional feature space? The answer is positive. In this paper, we address this problem and derive a dimensionality-dependent generalization bound for \(k\)-dimensional coding schemes by bounding the covering number of the loss function class induced by the reconstruction error. The bound is of order \(\mathcal{O}\bigl((mk\ln(mkn)/n)^{\lambda_n}\bigr)\), where \(m\) is the dimension of the features, \(k\) is the number of columns in the linear implementation of the coding schemes, \(n\) is the sample size, and \(\lambda_n > 0.5\) when \(n\) is finite and \(\lambda_n = 0.5\) when \(n\) is infinite. We show that our bound can be tighter than previous results because it avoids inducing the worst-case upper bound on \(k\) of the loss function and converges faster. The proposed generalization bound is also applied to some specific coding schemes to demonstrate that the dimensionality-dependent bound is an indispensable complement to these dimensionality-independent generalization bounds.
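To make the abstract's quantities concrete, the following is a sketch of the usual \(k\)-dimensional coding setup (the notation \(m\), \(k\), \(n\) follows the abstract; the constraint sets \(\mathcal{Y}\) below are illustrative choices and not necessarily the exact formulation used in the paper). Given a sample \(x_1,\dots,x_n\in\mathbb{R}^m\) and a linear implementation \(T=[t_1,\dots,t_k]\in\mathbb{R}^{m\times k}\), the reconstruction-error loss is
\[
\ell(T,x)\;=\;\min_{y\in\mathcal{Y}}\;\|x-Ty\|^2 ,
\]
where choosing \(\mathcal{Y}=\{e_1,\dots,e_k\}\) (the standard basis vectors) recovers \(k\)-means clustering and vector quantization, non-negativity constraints on \(y\) (and typically on \(T\)) recover non-negative matrix factorization, and a sparsity constraint on \(y\) recovers sparse coding and dictionary learning. The dimensionality-dependent bound described above controls the uniform deviation between the empirical and expected reconstruction error,
\[
\sup_{T}\;\Bigl|\tfrac{1}{n}\sum_{i=1}^{n}\ell(T,x_i)-\mathbb{E}\,\ell(T,x)\Bigr| ,
\]
via a covering-number argument on the induced loss class \(\{\ell(T,\cdot)\}\).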
Recommendations
- Generalization Bounds for K-Dimensional Coding Schemes in Hilbert Spaces
- Learning finite-dimensional coding schemes with nonlinear reconstruction maps
- The sample complexity of dictionary learning
- Lower bounds for sparse coding
- Learning Bounds for Kernel Regression Using Effective Data Dimensionality
Cites work
- scientific article; zbMATH DE number 3469876
- scientific article; zbMATH DE number 893887
- scientific article; zbMATH DE number 1391397
- scientific article; zbMATH DE number 3340881
- $K$-Dimensional Coding Schemes in Hilbert Spaces
- 10.1162/153244302760200713
- 10.1162/153244303321897690
- A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data
- A central limit theorem for k-means clustering
- A local search approximation algorithm for \(k\)-means clustering
- Adaptive Relevance Matrices in Learning Vector Quantization
- Atomic Decomposition by Basis Pursuit
- Concentration inequalities. A nonasymptotic theory of independence
- Distance Learning in Discriminative Vector Quantization
- Fast rates for empirical vector quantization
- Improved Minimax Bounds on the Test and Training Distortion of Empirically Designed Vector Quantizers
- Improved sparse coding under the influence of perceptual attention
- Individual Convergence Rates in Empirical Vector Quantizer Design
- Learning the parts of objects by non-negative matrix factorization
- Manifold Regularized Discriminative Nonnegative Matrix Factorization With Fast Gradient Descent
- NeNMF: An Optimal Gradient Method for Nonnegative Matrix Factorization
- Nonasymptotic bounds for vector quantization in Hilbert spaces
- Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis
- On the Performance of Clustering in Hilbert Spaces
- On the mathematical foundations of learning
- On the training distortion of vector quantizers
- Probability Inequalities for Sums of Bounded Random Variables
- Probability inequalities for empirical processes and a law of the iterated logarithm
- Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding
- Sample Complexity of Dictionary Learning and Other Matrix Factorizations
- Semidefinite programming based preconditioning for more robust near-separable nonnegative matrix factorization
- Sharper bounds for Gaussian and empirical processes
- Sparse coding on the spot: spontaneous retinal waves suffice for orientation selectivity
- The benefit of multitask representation learning
- The minimax distortion redundancy in empirical quantizer design
- The sample complexity of dictionary learning
- Unsupervised Spike Detection and Sorting with Wavelets and Superparamagnetic Clustering
Cited in (5)
- Bucketing Coding and Information Theory for the Statistical High-Dimensional Nearest-Neighbor Problem
- Generalization Bounds for K-Dimensional Coding Schemes in Hilbert Spaces
- Learning finite-dimensional coding schemes with nonlinear reconstruction maps
- $K$-Dimensional Coding Schemes in Hilbert Spaces
- Strong optimality of the normalized ML models as universal codes and information in data
This page was built for publication: Dimensionality-dependent generalization bounds for \(k\)-dimensional coding schemes