GSNs: generative stochastic networks
Publication: 4603726
DOI: 10.1093/IMAIAI/IAW003
zbMATH Open: 1380.68328
arXiv: 1503.05571
OpenAlex: W2159528849
MaRDI QID: Q4603726
FDO: Q4603726
Authors: Guillaume Alain, Yoshua Bengio, Li Yao, Jason Yosinski, Éric Thibodeau-Laufer, Saizheng Zhang, Pascal Vincent
Publication date: 19 February 2018
Published in: Information and Inference: A Journal of the IMA
Abstract: We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. Because the transition distribution is a conditional distribution generally involving a small move, it has fewer dominant modes, being unimodal in the limit of small moves. Thus, it is easier to learn, more like learning to perform supervised function approximation, with gradients that can be obtained by back-propagation. The theorems provided here generalize recent work on the probabilistic interpretation of denoising auto-encoders and provide an interesting justification for dependency networks and generalized pseudolikelihood (along with defining an appropriate joint distribution and sampling mechanism, even when the conditionals are not consistent). We study how GSNs can be used with missing inputs and can be used to sample subsets of variables given the rest. Successful experiments are conducted, validating these theoretical results, on two image datasets and with a particular architecture that mimics the Deep Boltzmann Machine Gibbs sampler but allows training to proceed with backprop, without the need for layerwise pretraining.
Full work available at URL: https://arxiv.org/abs/1503.05571
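The abstract describes training the transition operator of a Markov chain as a denoising-style conditional (corrupt, then reconstruct), so that iterating the learned operator samples from an estimate of the data distribution. The sketch below illustrates that idea in miniature; it is not the paper's architecture, and the network size, Gaussian corruption, binary cross-entropy loss, and toy data are illustrative assumptions only.

```python
# Minimal GSN-flavoured sketch: learn a transition operator x~ -> x with plain
# back-propagation, then sample by iterating corrupt -> reconstruct.
# All hyperparameters and the toy dataset below are assumptions, not the paper's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "data distribution": random binary vectors standing in for image batches.
data = (torch.rand(512, 64) > 0.5).float()

# Transition operator f(corrupt(x)) ~ x, parameterized as a small MLP.
denoiser = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.Sigmoid(),
)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

def corrupt(x, noise_std=0.5):
    """Small stochastic move C(x~ | x); here simple additive Gaussian noise."""
    return x + noise_std * torch.randn_like(x)

# Training: supervised function approximation of the conditional P(x | x~),
# with gradients obtained by ordinary back-propagation.
for step in range(2000):
    x = data[torch.randint(0, data.shape[0], (64,))]
    x_tilde = corrupt(x)
    loss = loss_fn(denoiser(x_tilde), x)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling: iterate the learned transition operator; the chain's stationary
# distribution is the model's estimate of the data distribution.
x = torch.rand(1, 64)
with torch.no_grad():
    for _ in range(100):
        x = torch.bernoulli(denoiser(corrupt(x)))
print(x)
```

In this toy setting the chain mixes quickly because each conditional move is small and close to unimodal, which is the property the abstract points to as making the learning problem easier than direct maximum likelihood.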
Recommendations
- What regularized auto-encoders learn from the data-generating distribution
- A Connection Between Score Matching and Denoising Autoencoders
- Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion
- Generative modeling of convolutional neural networks
- Representational Power of Restricted Boltzmann Machines and Deep Belief Networks
Cites Work
- Learning deep architectures for AI
- Perturbation theory and finite Markov chains
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Dependency networks for inference, collaborative filtering, and data visualization
- On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates
- On invariance and selectivity in representation learning
- A Fast Learning Algorithm for Deep Belief Nets
- Comparison of perturbation bounds for the stationary distribution of a Markov chain
- Consistency of Pseudolikelihood Estimation of Fully Visible Boltzmann Machines
- Enhanced Gradient for Training Restricted Boltzmann Machines
- Deep Haar scattering networks
Cited In (4)