scientific article

Ilya Sutskever, Nitish Srivastava, Alex Krizhevsky, Ruslan R. Salakhutdinov, Geoffrey E. Hinton

Publication date: 8 December 2014

Full work available at URL: http://jmlr.csail.mit.edu/papers/v15/srivastava14a.html

Title: Dropout: a simple way to prevent neural networks from overfitting

zbMATH Keywords

regularization networks model combination deep learning

Mathematics Subject Classification ID

Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Learning and adaptive systems in artificial intelligence (68T05)

Related Items

Hierarchical binding in convolutional neural networks: making adversarial attacks geometrically challenging, Extracting and inserting knowledge into stacked denoising auto-encoders, A survey on modern trainable activation functions, Manifold adversarial training for supervised and semi-supervised learning, Recurrent and convolutional neural networks in structural dynamics: a modified attention steered encoder-decoder architecture versus LSTM versus GRU versus TCN topologies to predict the response of shock wave-loaded plates, FCM-RDpA: TSK fuzzy regression model construction using fuzzy C-means clustering, regularization, droprule, and powerball adabelief, Impact of coronavirus disease 2019 on electricity demand and the unit commitment problem: a long–short-term memory-based machine learning approach, Hierarchically structured task-agnostic continual learning, Spiking recurrent neural networks for neuromorphic computing in nonlinear structural mechanics, Reconstruction of incomplete wildfire data using deep generative models, Reconstruction of proper numerical inlet boundary conditions for draft tube flow simulations using machine learning, Universal regular conditional distributions via probabilistic transformers, Quasi-optimal \textit{hp}-finite element refinements towards singularities via deep neural network prediction, Is there a role for statistics in artificial intelligence?, Assessing similarities between spatial point patterns with a siamese neural network discriminant model, Solving an inverse source problem by deep neural network method with convergence and error analysis, Consistent Sparse Deep Learning: Theory and Computation, Addressing discontinuous root-finding for subsequent differentiability in machine learning, inverse problems, and control, Generalized Lyapunov exponents and aspects of the theory of deep learning, Physics-based self-learning spiking neural network enhanced time-integration scheme for computing viscoplastic structural finite 
element response, Energy-Based Models with Applications to Speech and Language Processing, Exploring weight initialization, diversity of solutions, and degradation in recurrent neural networks trained for temporal and decision-making tasks, Deep convolutional Ritz method: parametric PDE surrogates without labeled data, Deep learning for natural language processing: a survey, Learning to increase the power of conditional randomization tests, Pruning during training by network efficacy modeling, Weighted neural tangent kernel: a generalized and improved network-induced kernel, The role of mutual information in variational classifiers, Semi-Supervised Node Classification via Semi-Global Graph Transformer Based on Homogeneity Augmentation, TRELM-DROP: an impavement non-iterative algorithm for traffic flow forecast, Combinatorial designs for deep learning, Modeling epidemics: neural network based on data and SIR-model, Fragility, robustness and antifragility in deep learning, A neural tensor decomposition model for high-order sparse data recovery, Quality measures for the evaluation of machine learning architectures on the quantification of epistemic and aleatoric uncertainties in complex dynamical systems, Boundary-safe PINNs extension: application to non-linear parabolic PDEs in counterparty credit risk, Deep learning approach to Hubble parameter, Stochastic perturbation of subgradient algorithm for nonconvex deep neural networks, Towards real-time fluid dynamics simulation: a data-driven NN-MPS method and its implementation, Deep Learning in the Natural Sciences: Applications to Physics, Optimal deep neural networks by maximization of the approximation power, Neural dynamic mode decomposition for end-to-end modeling of nonlinear dynamics, Optimization Design of Laminated Functionally Carbon Nanotube-Reinforced Composite Plates Using Deep Neural Networks and Differential Evolution, A viable framework for semi-supervised learning on realistic dataset, Ranks of elliptic 
curves and deep neural networks, Heterogeneity in Neuronal Dynamics Is Learned by Gradient Descent for Temporal Processing Tasks, Geometric deep learning: a temperature based analysis of graph neural networks, Probabilistic physics-guided transfer learning for material property prediction in extrusion deposition additive manufacturing, A self-adaptive fuzzy learning system for streaming data prediction, Evolving stochastic configure network: a more compact model with interpretability, Neural logic rule layers, On mathematical modeling in image reconstruction and beyond, Empirical prior based probabilistic inference neural network for policy learning, Consolidation of structure of high noise data by a new noise index and reinforcement learning, An Interpretive Constrained Linear Model for ResNet and MgNet, ExSpliNet: An interpretable and expressive spline-based neural network, Comments on: ``A random forest guided tour'', The relative performance of ensemble methods with deep convolutional neural networks for image classification, Modeling the Ventral and Dorsal Cortical Visual Pathways Using Artificial Neural Networks, Effects of depth, width, and initialization: A convergence analysis of layer-wise training for deep linear neural networks, Modeling surrender risk in life insurance: theoretical and experimental insight, Applying a 1D-CNN Network to Electricity Load Forecasting, Application of Deep Learning to Seizure Classification, Mean-field inference methods for neural networks, High-Dimensional Learning Under Approximate Sparsity with Applications to Nonsmooth Estimation and Regularized Neural Networks, Unnamed Item, RANDOM NEURAL NETWORK METHODS AND DEEP LEARNING, What Kinds of Functions Do Deep Neural Networks Learn? 
Insights from Variational Spline Theory, Unnamed Item, Unnamed Item, Reproducible Hyperparameter Optimization, Detecting Product Adoption Intentions via Multiview Deep Learning, SHEDR: An End-to-End Deep Neural Event Detection and Recommendation Framework for Hyperlocal News Using Social Media, Visual transfer for reinforcement learning via gradient penalty based Wasserstein domain confusion, An unsupervised deep learning approach to solving partial integro-differential equations, Large Sample Mean-Field Stochastic Optimization, A three-dimensional prediction method of stiffness properties of composites based on deep learning, Sequential classification of customer behavior based on sequence-to-sequence learning with gated-attention neural networks, Evaluating deep transfer learning for whole-brain cognitive decoding, Quantifying the separability of data classes in neural networks, Block-cyclic stochastic coordinate descent for deep neural networks, How to handle noisy labels for robust learning from uncertainty, Extremely randomized neural networks for constructing prediction intervals, Sparsity-control ternary weight networks, Graph deep learning model for mapping mineral prospectivity, A micromechanics‐based recurrent neural networks model for path‐dependent cyclic deformation of short fiber composites, Horseshoe Regularisation for Machine Learning in Complex and Deep Models, Priors in Bayesian Deep Learning: A Review, Backpropagation neural tree, Applications of limiters, neural networks and polynomial annihilation in higher-order FD/FV schemes, A noise-based stabilizer for convolutional neural networks, A neural network approach to predicting and computing knot invariants, Concept Cloud-Based Sentiment Visualization for Financial Reviews, The Discriminative Kalman Filter for Bayesian Filtering with Nonlinear and Nongaussian Observation Models, The Stochastic Delta Rule: Faster and More Accurate Deep Learning Through Adaptive Weight Noise, On Kernel 
Method–Based Connectionist Models and Supervised Deep Learning Without Backpropagation, Continuous Online Sequence Learning with an Unsupervised Neural Network Model, Learning the Structural Vocabulary of a Network, Deep Learning with Dynamic Spiking Neurons and Fixed Feedback Weights, Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review, Turbulent scalar flux in inclined jets in crossflow: counter gradient transport and deep learning modelling, Electricity Price Forecasting with Neural Networks on EPEX Order Books, Semisupervised Deep Stacking Network with Adaptive Learning Rate Strategy for Motor Imagery EEG Recognition, Sparse Deep Neural Networks Using L1,∞-Weight Normalization, Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization, Neural Simpletrons: Learning in the Limit of Few Labels with Directed Generative Networks, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Deep learning of mixing by two ‘atoms’ of stratified turbulence, Unnamed Item, Unnamed Item, Shallow neural networks for fluid flow reconstruction with limited sensors, Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks, Dying ReLU and Initialization: Theory and Numerical Examples, Implicit Deep Learning, Deep Learning at the Interface of Agricultural Insurance Risk and Spatio-Temporal Uncertainty in Weather Extremes, Time-series forecasting of mortality rates using deep learning, Neural networks with dynamical coefficients and adjustable connections on the basis of integrated backpropagation, Neural network embedding of the over-dispersed Poisson reserving model, Neural network model for multimodal recognition of human aggression, A NEURAL NETWORK BOOSTED DOUBLE OVERDISPERSED POISSON CLAIMS RESERVING MODEL, Convolutional autoencoder and conditional random fields hybrid for predicting 
spatial-temporal chaos, Color-mapped contour gait image for cross-view gait recognition using deep convolutional neural network, Facial expression recognition based on Gabor wavelet transform and 2-channel CNN, Unnamed Item, DNN-PPI: A LARGE-SCALE PREDICTION OF PROTEIN–PROTEIN INTERACTIONS BASED ON DEEP NEURAL NETWORKS, fPINNs: Fractional Physics-Informed Neural Networks, Aesthetic Discrimination of Graph Layouts, Deep learning for limit order books, Calibrating rough volatility models: a convolutional neural network approach, Partial differential equation regularization for supervised machine learning, Unnamed Item, Unnamed Item, Unnamed Item, An EM Algorithm for Capsule Regression, Joint Structure and Parameter Optimization of Multiobjective Sparse Neural Network, On the Achievability of Blind Source Separation for High-Dimensional Nonlinear Source Mixtures, Entropy-SGD: biasing gradient descent into wide valleys, Veridical data science, Unnamed Item, Condition Monitoring of Equipment in Oil Wells using Deep Learning, Unnamed Item, Graph interpolating activation improves both natural and robust accuracies in data-efficient deep learning, PDE-Aware Deep Learning for Inverse Problems in Cardiac Electrophysiology, Multi-Objective Optimization of Laminated Functionally Graded Carbon Nanotube-Reinforced Composite Plates Using Deep Feedforward Neural Networks-NSGAII Algorithm, CALIBRATING THE LEE-CARTER AND THE POISSON LEE-CARTER MODELS VIA NEURAL NETWORKS, Approximation of discontinuous inverse operators with neural networks, Matrix factorization with dual-network collaborative embedding for social recommendation, A jamming transition from under- to over-parametrization affects generalization in deep learning, Multilinear Compressive Sensing and an Application to Convolutional Linear Networks, Fast Convex Pruning of Deep Neural Networks, A Priori Sub-grid Modelling Using Artificial Neural Networks, Global optimization issues in deep network regression: an overview, A 
stochastic subgradient method for distributionally robust non-convex and non-smooth learning, Lipschitzness is all you need to tame off-policy generative adversarial imitation learning, JGPR: a computationally efficient multi-target Gaussian process regression algorithm, Interpreting rate-distortion of variational autoencoder and using model uncertainty for anomaly detection, Benchmarking penalized regression methods in machine learning for single cell RNA sequencing data, Deep calibration of financial models: turning theory into practice, Brain-Inspired Constructive Learning Algorithms with Evolutionally Additive Nonlinear Neurons, Genuinely distributed Byzantine machine learning, High-efficiency chaotic time series prediction based on time convolution neural network, CLMB: deep contrastive learning for robust metagenomic binning, A survey of deep network techniques all classifiers can adopt, Actuarial intelligence in auto insurance: claim frequency modeling with driving behavior features and improved boosted trees, Drop-activation: implicit parameter reduction and harmonious regularization, Learning in the machine: the symmetries of the deep learning channel, Accelerating multiscale finite element simulations of history-dependent materials using a recurrent neural network, A deep learning-based hybrid approach for the solution of multiphysics problems in electrosurgery, GXNOR-Net: training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework, A survey on deep matrix factorizations, Data-driven modelling of the Reynolds stress tensor using random forests with invariance, Artificial neural networks in structural dynamics: a new modular radial basis function approach vs. 
convolutional and feedforward topologies, Computational graph completion, Hierarchical clustering with deep q-learning, Equivalence between dropout and data augmentation: a mathematical check, Scalable uncertainty quantification for deep operator networks using randomized priors, Understanding autoencoders with information theoretic concepts, Robust manifold broad learning system for large-scale noisy chaotic time series prediction: a perturbation perspective, Transformed \(\ell_1\) regularization for learning sparse deep neural networks, Data science applications to string theory, A stochastic variational framework for recurrent Gaussian processes models, Deep limits of residual neural networks, Micromechanics-based surrogate models for the response of composites: a critical comparison between a classical mesoscale constitutive model, hyper-reduction and neural networks, Accelerating algebraic multigrid methods via artificial neural networks, Two-point step size gradient method for solving a deep learning problem, Uncertainty quantification in scientific machine learning: methods, metrics, and comparisons, Seismic Bayesian evidential learning: estimation and uncertainty quantification of sub-resolution reservoir properties, Convolutional neural networks (CNN) for feature-based model calibration under uncertain geologic scenarios, Novel convolutional neural network architecture for improved pulmonary nodule classification on computed tomography, Stochastic quantization for learning accurate low-bit deep neural networks, Individualized risk assessment of preoperative opioid use by interpretable neural network regression, DeepBND: a machine learning approach to enhance multiscale solid mechanics, Privacy preserving multi-party computation delegation for deep learning in cloud computing, A multimodal parallel method for left ventricular dysfunction identification based on phonocardiogram and electrocardiogram signals synchronous analysis, Impact of random weights on 
nonlinear system identification using convolutional neural networks, On the rate of convergence of a deep recurrent neural network estimate in a regression problem with dependent data, Propositionalization and embeddings: two sides of the same coin, Frame regularization of a convolutional neural network in image-classification problems, Control of partial differential equations via physics-informed neural networks, Affine-invariant ensemble transform methods for logistic regression, Automated feature selection procedure for particle jet classification, Improve robustness and accuracy of deep neural network with \(L_{2,\infty}\) normalization, Forecasting of nonlinear dynamics based on symbolic invariance, Neural-net-induced Gaussian process regression for function approximation and PDE solution, Regularized greedy column subset selection, Nonparametric regression using deep neural networks with ReLU activation function, Machine learning entanglement freedom, Experimental Analysis of the Accessibility of Drawings with Few Segments, Accelerating flash calculation through deep learning methods, Deep learning of dynamics and signal-noise decomposition with time-stepping constraints, Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems, Machine learning for fast and reliable solution of time-dependent differential equations, Deep learning approach to the detection of scattering delay in radar images, Efficient and data-driven prediction of water breakthrough in subsurface systems using deep long short-term memory machine learning, Machine learning for accelerating macroscopic parameters prediction for poroelasticity problem in stochastic media, The Architectures of Geoffrey Hinton, Design of Load Forecast Systems Resilient Against Cyber-Attacks, Incorporating grain-scale processes in macroscopic sediment transport models. 
A review and perspectives for environmental and geophysical applications, Backtracking gradient descent method and some applications in large scale optimisation. II: Algorithms and experiments, \(\beta\)-Variational autoencoder as an entanglement classifier, Instance weighting through data imprecisiation, Surrogate modeling of elasto-plastic problems via long short-term memory neural networks and proper orthogonal decomposition, Seismic stratum segmentation using an encoder-decoder convolutional neural network, Physically-constrained data-driven inversions to infer the bed topography beneath glaciers flows. Application to East Antarctica, Joint sentence and aspect-level sentiment analysis of product comments, Theoretical investigation of generalization bounds for adversarial learning of deep neural networks, Dropout fails to regularize nonparametric learners, Deep learning for quantile regression under right censoring: deepquantreg, Geometric uncertainty in patient-specific cardiovascular modeling with convolutional dropout networks, Surprising properties of dropout in deep networks, Unnamed Item, Neural networks for topology optimization, A survey of randomized algorithms for training neural networks, Class sparsity signature based restricted Boltzmann machine, Neural bag-of-features learning, Nonredundant sparse feature extraction using autoencoders with receptive fields clustering, Attention pooling-based convolutional neural network for sentence modelling, A new centrality measure of nodes in multilayer networks under the framework of tensor computation, Data-driven acceleration of multiscale methods for uncertainty quantification: application in transient multiphase flow in porous media, A survey on semi-supervised learning, Analysis of 20th century surface air temperature using linear dynamical modes, Computational mechanics enhanced by deep learning, Sparse kernel deep stacking networks, Deep relaxation: partial differential equations for optimizing deep 
neural networks, Syntax-aware entity representations for neural relation extraction, Generating quantitative product profile using char-word CNNs, A novel conjoint triad auto covariance (CTAC) coding method for predicting protein-protein interaction based on amino acid sequence, Sex: the power of randomization, Stacked-GRU based power system transient stability assessment method, Unnamed Item, Unnamed Item, Solving inverse problems using conditional invertible neural networks, Fluid sensing using microcantilevers: from physics-based modeling to deep learning, On the antiderivatives of \(x^p/(1 - x)\) with an application to optimize loss functions for classification with neural networks, Deep learning for simultaneous measurements of pressure and temperature using arch resonators, Multi-view graph convolutional networks with attention mechanism, A survey on deep learning and its applications, Sweep-Net: an artificial neural network for radiation transport solves, A review on instance ranking problems in statistical learning, Inclusion of domain-knowledge into GNNs using mode-directed inverse entailment, Symbolic DNN-tuner, Non-intrusive model reduction of large-scale, nonlinear dynamical systems using deep learning, Emotion recognition in talking-face videos using persistent entropy and neural networks, On better training the infinite restricted Boltzmann machines, Comparative analysis of the results of training a neural network with calculated weights and with random generation of the weights, Controlling oscillations in spectral methods by local artificial viscosity governed by neural networks, Solving inverse-PDE problems with physics-aware neural networks, Deep distribution regression, Effect of dual-convolutional neural network model fusion for aluminum profile surface defects classification and recognition, A data-driven framework for the stochastic reconstruction of small-scale features with application to climate data sets, Vampire with a brain is a good ITP 
hammer, Deep learning of CMB radiation temperature, Machine learning for fluid flow reconstruction from limited measurements, Prediction of hereditary cancers using neural networks, State estimation with limited sensors -- a deep learning based approach, A deep learning framework for constitutive modeling based on temporal convolutional network, Deep Gaussian process autoencoders for novelty detection, Analyzing business process anomalies using autoencoders, An improved fast iterative shrinkage thresholding algorithm with an error for image deblurring problem, Neural network training using \(\ell_1\)-regularization and bi-fidelity data, Automated porosity estimation using CT-scans of extracted core data, Deep learning for the partially linear Cox model, Transformer-based deep neural language modeling for construct-specific automatic item generation, Neural network for complex systems: theory and applications, Stochastic loss reserving with mixture density neural networks, Extensions of stability selection using subsamples of observations and covariates, Logitboost autoregressive networks, Reliability assessment of CNC machining center based on Weibull neural network, Deep learning in color: towards automated quark/gluon jet discrimination, Expected energy-based restricted Boltzmann machine for classification, Deep recurrent model for server load and performance prediction in data center, Preserving differential privacy in convolutional deep belief networks, An evaluation of linear and non-linear models of expressive dynamics in classical piano and symphonic music, A machine learning approach for efficient uncertainty quantification using multiscale methods, A quantum-implementable neural network model, Dropout training for SVMs with data augmentation, A hierarchical neural-network-based document representation approach for text classification, Aesthetic discrimination of graph layouts, Profiled power analysis attacks using convolutional neural networks with domain 
knowledge, Using isotope composition and other node attributes to predict edges in fish trophic networks, The design and implementation of cardiotocography signals classification algorithm based on neural network, Algorithms for drug sensitivity prediction, Learning in the machine: to share or not to share?, A comparative study for glioma classification using deep convolutional neural networks, CPINet: parameter identification of path-dependent constitutive model with automatic denoising based on CNN-LSTM, Deep neural networks, gradient-boosted trees, random forests: statistical arbitrage on the S\&P 500, Multilayered neural architectures evolution for computing sequences of orthogonal polynomials, Deep UQ: learning deep neural network surrogate models for high dimensional uncertainty quantification, A technical review of convolutional neural network-based mammographic breast cancer diagnosis, Deep learning for time series classification: a review, A non-cooperative meta-modeling game for automated third-party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks, Development of an algorithm for reconstruction of droplet history based on deposition pattern using computational fluid dynamics and convolutional neural network, Geometric deep learning for computational mechanics. 
I: Anisotropic hyperelasticity, BlackBox: generalizable reconstruction of extremal values from incomplete spatio-temporal data, Retail sales forecasting with meta-learning, Option valuation under no-arbitrage constraints with neural networks, Word-class embeddings for multiclass text classification, Dataset2Vec: learning dataset meta-features, A selective overview of deep learning, Dynamical robustness and its structural dependence in biological networks, A construction for circulant type dropout designs, Communication modulation signal recognition based on the deep multi-hop neural network, Material optimization of tri-directional functionally graded plates by using deep neural network and isogeometric multimesh design approach, Regularisation of neural networks by enforcing Lipschitz continuity, Protect privacy of deep classification networks by exploiting their generative power, MODES: model-based optimization on distributed embedded systems, Improving ENIGMA-style clause selection while learning from history, Handwritten mathematical expression recognition via paired adversarial learning, MGAT: multi-view graph attention networks, An individual claims reserving model for reported claims, Three-dimensional structural geological modeling using graph neural networks, Clustering-based simultaneous forecasting of life expectancy time series through Long-Short Term Memory Neural Networks, TRU-NET: a deep learning approach to high resolution prediction of rainfall, Deep learning and multivariate time series for cheat detection in video games, Topological measurement of deep neural networks using persistent homology, Fusing sufficient dimension reduction with neural networks, Deep regularization and direct training of the inner layers of neural networks with kernel flows, A full stage data augmentation method in deep convolutional neural network for natural image classification, Default risk prediction and feature extraction using a penalized deep neural network, Deep 
learning of chroma representation for cover song identification in compression domain, Upscaling of two-phase discrete fracture simulations using a convolutional neural network, RLF-LPI: an ensemble learning framework using sequence information for predicting lncRNA-protein interaction based on AE-ResLSTM and fuzzy decision, An ensemble framework based on deep CNNs architecture for glaucoma classification using fundus photography, Multifidelity data fusion in convolutional encoder/decoder networks, Adaptive infinite dropout for noisy and sparse data streams, Machine learning for corporate default risk: multi-period prediction, frailty correlation, loan portfolios, and tail probabilities, A unified neural network framework for extended redundancy analysis, A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks, Extending business failure prediction models with textual website content using deep learning, Do ideas have shape? Idea registration as the continuous limit of artificial neural networks, Heavy-tails and randomized restarting beam search in goal-oriented neural sequence decoding

Uses Software