Universal regular conditional distributions via probabilistic transformers


DOI: 10.1007/S00365-023-09635-3
zbMATH Open: 1529.68277
arXiv: 2105.07743
OpenAlex: W4361018823
MaRDI QID: Q6101232
FDO: Q6101232


Author: Anastasis Kratsios


Publication date: 20 June 2023

Published in: Constructive Approximation

Abstract: We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map; next, a deep feedforward neural network processes these linearized features; finally, the network's outputs are transformed to the 1-Wasserstein space $\mathcal{P}_1(\mathbb{R}^D)$ via a probabilistic extension of the attention mechanism of Bahdanau et al. (2014). Our model, called the probabilistic transformer (PT), can quantitatively approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compact sets. We identify two ways in which the PT avoids the curse of dimensionality when approximating $\mathcal{P}_1(\mathbb{R}^D)$-valued functions. The first strategy builds functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ which can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$. In the second approach, given any function $f$ in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$, we build compact subsets of $\mathbb{R}^d$ on which $f$ can be efficiently approximated by a PT.
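
Below is a minimal sketch (in PyTorch) of the three-phase architecture described in the abstract: a feature map, a deep feedforward network, and a softmax "probabilistic attention" step whose weights over a finite set of atoms in $\mathbb{R}^D$ encode an empirical measure in $\mathcal{P}_1(\mathbb{R}^D)$. The class name, hidden sizes, and the representation of outputs as a mixture of Dirac masses are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch (not the authors' code) of the three-phase model from the
# abstract: feature map -> deep feedforward network -> probabilistic attention
# producing softmax weights over fixed atoms in R^D, i.e. a convex combination
# of Dirac measures, which is a point in the 1-Wasserstein space P_1(R^D).
import torch
import torch.nn as nn


class ProbabilisticTransformerSketch(nn.Module):
    def __init__(self, d_in: int, d_out: int, n_atoms: int = 32, width: int = 64):
        super().__init__()
        # Phase 1: feature map "linearizing" the input into R^d.
        # Here the input is already Euclidean, so a learned affine map stands in.
        self.feature_map = nn.Linear(d_in, width)
        # Phase 2: deep feedforward network on the linearized features.
        self.body = nn.Sequential(
            nn.ReLU(), nn.Linear(width, width), nn.ReLU(), nn.Linear(width, n_atoms)
        )
        # Phase 3: probabilistic attention -> softmax weights over atoms
        # y_1, ..., y_N in R^D; the output measure is sum_k w_k(x) * delta_{y_k}.
        self.atoms = nn.Parameter(torch.randn(n_atoms, d_out))

    def forward(self, x: torch.Tensor):
        scores = self.body(self.feature_map(x))
        weights = torch.softmax(scores, dim=-1)  # mixture weights w_k(x)
        return weights, self.atoms               # together encode a measure in P_1(R^D)


if __name__ == "__main__":
    model = ProbabilisticTransformerSketch(d_in=3, d_out=2)
    w, atoms = model(torch.randn(5, 3))
    print(w.shape, atoms.shape)  # torch.Size([5, 32]) torch.Size([32, 2])
```

In this reading, approximation of a $\mathcal{P}_1(\mathbb{R}^D)$-valued function amounts to matching the target distributions in 1-Wasserstein distance by such finitely supported mixtures, with the softmax weights depending on the input.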


Full work available at URL: https://arxiv.org/abs/2105.07743







