Input perturbations for adaptive control and learning

DOI: 10.1016/J.AUTOMATICA.2020.108950
zbMATH Open: 1441.93148
arXiv: 1811.04258
OpenAlex: W3015500735
MaRDI QID: Q2184498
FDO: Q2184498

Authors: Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

Publication date: 29 May 2020

Published in: Automatica

Abstract: This paper studies adaptive algorithms for simultaneous regulation (i.e., control) and estimation (i.e., learning) of Multiple Input Multiple Output (MIMO) linear dynamical systems. It proposes practical, easy-to-implement control policies based on perturbations of input signals. Such policies are shown to achieve a worst-case regret that scales as the square root of the time horizon and holds uniformly over time. Further, it discusses specific settings where such greedy policies attain the information-theoretic lower bound of logarithmic regret. To establish the results, recent advances on self-normalized martingales, together with a novel method of policy decomposition, are leveraged.

Full work available at URL: https://arxiv.org/abs/1811.04258

zbMATH Keywords

system identification; linear-quadratic; decision-making under uncertainty; exploration-exploitation; adaptive LQRs; finite-time optimality; greedy policies

Mathematics Subject Classification ID

Linear systems in control theory (93C05); Multivariable systems, multidimensional control systems (93C35); Adaptive control/observation systems (93C40); Perturbations in control/observation systems (93C73)

Cited In (8)
