Optimal transport natural gradient for statistical manifolds with continuous sample space
Publication:2006192
Abstract: We study the Wasserstein natural gradient in parametric statistical models with continuous sample spaces. Our approach is to pull back the $L^2$-Wasserstein metric tensor in the probability density space to a parameter space, equipping the latter with a positive definite metric tensor, under which it becomes a Riemannian manifold, named the Wasserstein statistical manifold. In general, it is not a totally geodesic sub-manifold of the density space, and therefore its geodesics will differ from the Wasserstein geodesics, except for the well-known Gaussian distribution case, a fact which can also be validated under our framework. We use the sub-manifold geometry to derive a gradient flow and natural gradient descent method in the parameter space. When the parametrized densities lie in $\mathbb{R}$, the induced metric tensor admits an explicit formula. In optimization problems, we observe that the natural gradient descent outperforms the standard gradient descent when the Wasserstein distance is the objective function. In such a case, we prove that the resulting algorithm behaves similarly to the Newton method in the asymptotic regime. The proof calculates the exact Hessian formula for the Wasserstein distance, which further motivates another preconditioner for the optimization process. Finally, we present examples to illustrate the effectiveness of the natural gradient in several parametric statistical models, including the Gaussian measure, Gaussian mixture, Gamma distribution, and Laplace distribution.
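As a rough illustration of the one-dimensional case mentioned in the abstract, the sketch below evaluates a metric tensor of the form G(θ)_ij = ∫ ∂_θi F(x,θ) ∂_θj F(x,θ) / ρ(x,θ) dx, where F is the cumulative distribution function and ρ the density. This expression is the usual closed form for the pulled-back Wasserstein metric on the real line; it is stated here as an assumption and is not quoted from this page. The function names, the finite-difference step, and the integration window are all hypothetical choices made for the example. For the Gaussian family in (mean, standard deviation) coordinates the result should be close to the identity matrix, consistent with the squared 2-Wasserstein distance between one-dimensional Gaussians being (μ1 − μ2)² + (σ1 − σ2)².

```python
import math


def gaussian_pdf(x, theta):
    """Density of N(mu, sigma^2) with theta = (mu, sigma)."""
    mu, sigma = theta
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, theta):
    """Cumulative distribution function of N(mu, sigma^2)."""
    mu, sigma = theta
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def wasserstein_metric_1d(cdf, pdf, theta, lo, hi, n=4000, eps=1e-4):
    """Approximate G_ij = integral of dF/dtheta_i * dF/dtheta_j / rho over [lo, hi].

    Assumptions of this sketch: parameter derivatives of the CDF are taken
    by central finite differences, and the integral uses the midpoint rule
    on a window that holds essentially all of the probability mass.
    """
    d = len(theta)
    h = (hi - lo) / n
    G = [[0.0] * d for _ in range(d)]
    for k in range(n):
        x = lo + (k + 0.5) * h  # midpoint of the k-th cell
        rho = pdf(x, theta)
        grad_F = []
        for i in range(d):
            up = list(theta)
            dn = list(theta)
            up[i] += eps
            dn[i] -= eps
            grad_F.append((cdf(x, up) - cdf(x, dn)) / (2.0 * eps))
        for i in range(d):
            for j in range(d):
                G[i][j] += grad_F[i] * grad_F[j] / rho * h
    return G


theta = (0.3, 1.7)  # arbitrary (mu, sigma) chosen for the check
mu, sigma = theta
G = wasserstein_metric_1d(
    gaussian_cdf, gaussian_pdf, theta, mu - 6.0 * sigma, mu + 6.0 * sigma
)
```

Given such a G, a natural gradient step for an objective f would take the form θ ← θ − h G(θ)⁻¹ ∇f(θ). For the Gaussian family in these coordinates G is approximately the identity, so the step coincides with an ordinary gradient step; other families (for example Gamma or Laplace) would generally give a non-trivial G.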
Cites work
- scientific article; zbMATH DE number 3761167
- scientific article; zbMATH DE number 2152346
- A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem
- An optimal transport approach for seismic tomography: application to 3D full waveform inversion
- Application of the Wasserstein metric to seismic signals
- Computational optimal transport. With applications to data sciences
- Constrained steepest descent in the 2-Wasserstein metric
- Geometry of matrix decompositions seen through optimal transport and information geometry
- Information geometry
- Information geometry and its applications
- Information geometry connecting Wasserstein distance and Kullback-Leibler divergence via the entropy-relaxed transportation problem
- Information-geometric optimization algorithms: a unifying picture via invariance principles
- Large-scale dynamics of mean-field games driven by local Nash equilibria
- Logarithmic divergences from optimal transport and Rényi geometry
- Natural gradient flow in the mixture geometry of a discrete exponential family
- Natural gradient via optimal transport
- Online natural gradient as a Kalman filter
- Optimal algorithms for online scheduling with bounded rearrangement at the end
- Optimal transport for seismic full waveform inversion
- Population games and discrete optimal transport
- Ricci curvature for metric-measure spaces via optimal transport
- Robust estimation of natural gradient in optimization by regularized linear regression
- Some geometric calculations on Wasserstein space
- Stability of a 4th-order curvature condition arising in optimal transport theory
- The geometry of dissipative evolution equations: the porous medium equation
- The Density Manifold and Configuration Space Quantization
- The quadratic Wasserstein metric for earthquake location
- Wasserstein Riemannian geometry of Gaussian densities
- Wasserstein geometry of Gaussian measures
Cited in (13)
- scientific article; zbMATH DE number 7192338
- Quantum statistical learning via quantum Wasserstein natural gradient
- Natural gradient via optimal transport
- High order spatial discretization for variational time implicit schemes: Wasserstein gradient flows and reaction-diffusion systems
- Mean-field and kinetic descriptions of neural differential equations
- When optimal transport meets information geometry
- Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems
- Natural gradient for combined loss using wavelets
- Wasserstein information matrix
- Information geometry connecting Wasserstein distance and Kullback-Leibler divergence via the entropy-relaxed transportation problem
- Affine natural proximal learning
- Information geometry of Wasserstein statistics on shapes and affine deformations
- Lagrangian and Hamiltonian dynamics for probabilities on the statistical bundle