Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040): Difference between revisions

From MaRDI portal
Set OpenAlex properties.
ReferenceBot
Changed an Item
 
Property / cites work
 
Property / cites work: Superhuman AI for multiplayer poker / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed learning and cooperative control for multi-agent systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Dynamic Potential Games With Constraints: Fundamentals and Applications in Communications / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Concise Introduction to Decentralized POMDPs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Decentralized Q-Learning for Stochastic Teams and Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3376698 / rank
 
Normal rank
Property / cites work
 
Property / cites work: \({\mathcal Q}\)-learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convergence results for single-step on-policy reinforcement-learning algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: An Adaptive Sampling Algorithm for Solving Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sample mean based index policies by O(log n) regret for the multi-armed bandit problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite-time analysis of the multiarmed bandit problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4626283 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2934010 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4533362 / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Actor-Critic Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Natural actor-critic algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4226167 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4382287 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Decomposition of dynamic team decision problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Discrete-Time Stochastic Control and Dynamic Potential Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: \(\mathcal{QD}\)-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games / rank
 
Normal rank
Property / cites work
 
Property / cites work: \(H^\infty\)-optimal control and related minimax design problems. A dynamic game approach. / rank
 
Normal rank
Property / cites work
 
Property / cites work: 10.1162/1532443041827880 / rank
 
Normal rank
Property / cites work
 
Property / cites work: The Complexity of Decentralized Control of Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3576736 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Multiagent Systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: The complexity of two-person zero-sum games in extensive form / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5817864 / rank
 
Normal rank
Property / cites work
 
Property / cites work: AWESOME: a general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents / rank
 
Normal rank
Property / cites work
 
Property / cites work: If multi-agent learning is the answer, what is the question? / rank
 
Normal rank
Property / cites work
 
Property / cites work: Multiagent learning using a variable learning rate / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3524713 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4552708 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5148965 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal Decentralized Control of Coupled Subsystems With Control Sharing / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed Policy Evaluation Under Multiple Behavior Strategies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: The Evolution of Conventions / rank
 
Normal rank
Property / cites work
 
Property / cites work: Potential games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Handbook of Dynamic Game Theory / rank
 
Normal rank
Property / cites work
 
Property / cites work: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle / rank
 
Normal rank
Property / cites work
 
Property / cites work: Mean field games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Mean Field Games and Mean Field Type Control Theory / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Sensitive Mean-Field Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic networked control systems. Stabilization and optimization under information constraints / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed learning of average belief over networks using sequential observations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed Subgradient Methods for Multi-Agent Optimization / rank
 
Normal rank
Property / cites work
 
Property / cites work: Cooperative Convex Optimization in Networked Systems: Augmented Lagrangian Algorithms With Directed Gossip Communication / rank
 
Normal rank
Property / cites work
 
Property / cites work: Diffusion Strategies Outperform Consensus Strategies for Distributed Estimation Over Adaptive Networks / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4969098 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2810885 / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Proximal Gradient Consensus Over Random Networks / rank
 
Normal rank
Property / cites work
 
Property / cites work: Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Performance Bounds in $L_p$‐norm for Approximate Value Iteration / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3096132 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path / rank
 
Normal rank
Property / cites work
 
Property / cites work: Harnessing Smoothness to Accelerate Distributed Optimization / rank
 
Normal rank
Property / cites work
 
Property / cites work: Minimizing finite sums with the stochastic average gradient / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5477862 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed Stochastic Approximation: Weak Convergence and Network Design / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3624177 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimally Solving Dec-POMDPs as Continuous-State MDPs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4027186 / rank
 
Normal rank
Property / cites work
 
Property / cites work: The Complexity of Computing a Nash Equilibrium / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3433855 / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Nonterminating Stochastic Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Discounted Markov games: Generalized policy iteration method / rank
 
Normal rank
Property / cites work
 
Property / cites work: Algorithms for discounted stochastic games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor / rank
 
Normal rank
Property / cites work
 
Property / cites work: Model-free \(Q\)-learning designs for linear discrete-time zero-sum games with application to \(H^\infty\) control / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2834459 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2896090 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Fast algorithms for finding randomized strategies in game trees / rank
 
Normal rank
Property / cites work
 
Property / cites work: Efficient computation of behavior strategies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Efficient computation of equilibria for extensive two-person games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4506458 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3245635 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5808755 / rank
 
Normal rank
Property / cites work
 
Property / cites work: An iterative method of solving a game / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Approximations and Differential Inclusions / rank
 
Normal rank
Property / cites work
 
Property / cites work: A general class of adaptive strategies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Belief affirming in learning processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: No-regret dynamics and fictitious play / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4421713 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Consistency and cautious fictitious play / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the Global Convergence of Stochastic Fictitious Play / rank
 
Normal rank
Property / cites work
 
Property / cites work: Generalised weakened fictitious play / rank
 
Normal rank
Property / cites work
 
Property / cites work: Consistency of Vanishingly Smooth Fictitious Play / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sampled fictitious play is Hannan consistent / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3093261 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation. A dynamical systems viewpoint. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Prediction, Learning, and Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: The Nonstochastic Multiarmed Bandit Problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: The weighted majority algorithm / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive game playing using multiplicative weights / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Simple Adaptive Procedure Leading to Correlated Equilibrium / rank
 
Normal rank
Property / cites work
 
Property / cites work: Revisiting CFR+ and Alternating Updates / rank
 
Normal rank
Property / cites work
 
Property / cites work: Analysis of Hannan consistent selection for Monte Carlo tree search in simultaneous move games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5381139 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Settling the complexity of computing two-player Nash equilibria / rank
 
Normal rank
Property / cites work
 
Property / cites work: Subjectivity and correlation in randomized strategies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Markov-Nash Equilibria in Mean-Field Games with Discounted Cost / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate Nash Equilibria in Partially Observed Stochastic Games with Mean-Field Interactions / rank
 
Normal rank
Property / cites work
 
Property / cites work: Discrete-time average-cost mean-field games on Polish spaces / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate Markov-Nash Equilibria for Discrete-Time Risk-Sensitive Mean-Field Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite mean field games: fictitious play and convergence to a first order continuous mean field game / rank
 
Normal rank
Property / cites work
 
Property / cites work: Value iteration algorithm for mean-field games / rank
 
Normal rank
Property / cites work
 
Property / cites work: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play / rank
 
Normal rank
Property / cites work
 
Property / cites work: The challenge of poker / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5801632 / rank
 
Normal rank
Property / cites work
 
Property / cites work: DeepStack: Expert-level artificial intelligence in heads-up no-limit poker / rank
 
Normal rank
Property / cites work
 
Property / cites work: Superhuman AI for heads-up no-limit poker: Libratus beats top professionals / rank
 
Normal rank
Property / cites work
 
Property / cites work: A near-optimal polynomial time algorithm for learning in certain classes of stochastic games / rank
 
Normal rank
Property / cites work
 
Property / cites work: 10.1162/153244303765208377 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5744808 / rank
 
Normal rank

Latest revision as of 16:31, 30 July 2024

scientific article
Language Label Description Also known as
English
Multi-agent reinforcement learning: a selective overview of theories and algorithms
scientific article

    Statements

    Multi-agent reinforcement learning: a selective overview of theories and algorithms (English)
    28 October 2022

    Identifiers