Universal regular conditional distributions via probabilistic transformers


DOI: 10.1007/S00365-023-09635-3
zbMATH Open: 1529.68277
arXiv: 2105.07743
OpenAlex: W4361018823
MaRDI QID: Q6101232
FDO: Q6101232


Author: Anastasis Kratsios


Publication date: 20 June 2023

Published in: Constructive Approximation

Abstract: We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map; next, a deep feedforward neural network processes these linearized features; finally, the network's outputs are transformed to the 1-Wasserstein space $\mathcal{P}_1(\mathbb{R}^D)$ via a probabilistic extension of the attention mechanism of Bahdanau et al. (2014). Our model, called the probabilistic transformer (PT), can quantitatively approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compact sets. We identify two ways in which the PT avoids the curse of dimensionality when approximating $\mathcal{P}_1(\mathbb{R}^D)$-valued functions. The first strategy builds functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ which can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$. In the second approach, given any function $f$ in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$, we build compact subsets of $\mathbb{R}^d$ on which $f$ can be efficiently approximated by a PT.
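
Below is a minimal sketch (in PyTorch) of the three-phase architecture described in the abstract: a feature map, a deep feedforward network, and a softmax "probabilistic attention" step whose weights over a finite set of atoms in $\mathbb{R}^D$ encode an empirical measure in $\mathcal{P}_1(\mathbb{R}^D)$. The class name, hidden sizes, and the representation of outputs as a mixture of Dirac masses are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch (not the authors' code) of the three-phase model from the
# abstract: feature map -> deep feedforward network -> probabilistic attention
# producing softmax weights over fixed atoms in R^D, i.e. a convex combination
# of Dirac measures, which is a point in the 1-Wasserstein space P_1(R^D).
import torch
import torch.nn as nn


class ProbabilisticTransformerSketch(nn.Module):
    def __init__(self, d_in: int, d_out: int, n_atoms: int = 32, width: int = 64):
        super().__init__()
        # Phase 1: feature map "linearizing" the input into R^d.
        # Here the input is already Euclidean, so a learned affine map stands in.
        self.feature_map = nn.Linear(d_in, width)
        # Phase 2: deep feedforward network on the linearized features.
        self.body = nn.Sequential(
            nn.ReLU(), nn.Linear(width, width), nn.ReLU(), nn.Linear(width, n_atoms)
        )
        # Phase 3: probabilistic attention -> softmax weights over atoms
        # y_1, ..., y_N in R^D; the output measure is sum_k w_k(x) * delta_{y_k}.
        self.atoms = nn.Parameter(torch.randn(n_atoms, d_out))

    def forward(self, x: torch.Tensor):
        scores = self.body(self.feature_map(x))
        weights = torch.softmax(scores, dim=-1)  # mixture weights w_k(x)
        return weights, self.atoms               # together encode a measure in P_1(R^D)


if __name__ == "__main__":
    model = ProbabilisticTransformerSketch(d_in=3, d_out=2)
    w, atoms = model(torch.randn(5, 3))
    print(w.shape, atoms.shape)  # torch.Size([5, 32]) torch.Size([32, 2])
```

In this reading, approximation of a $\mathcal{P}_1(\mathbb{R}^D)$-valued function amounts to matching the target distributions in 1-Wasserstein distance by such finitely supported mixtures, with the softmax weights depending on the input.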


Full work available at URL: https://arxiv.org/abs/2105.07743







