Input perturbations for adaptive control and learning

Publication:2184498

DOI: 10.1016/J.AUTOMATICA.2020.108950
zbMATH Open: 1441.93148
arXiv: 1811.04258
OpenAlex: W3015500735
MaRDI QID: Q2184498
FDO: Q2184498


Authors: Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis


Publication date: 29 May 2020

Published in: Automatica

Abstract: This paper studies adaptive algorithms for simultaneous regulation (i.e., control) and estimation (i.e., learning) of Multiple Input Multiple Output (MIMO) linear dynamical systems. It proposes practical, easy-to-implement control policies based on perturbations of input signals. Such policies are shown to achieve a worst-case regret that scales as the square root of the time horizon, a bound that holds uniformly over time. Further, the paper discusses specific settings where such greedy policies attain the information-theoretic lower bound of logarithmic regret. To establish the results, recent advances on self-normalized martingales are leveraged together with a novel method of policy decomposition.


Full work available at URL: https://arxiv.org/abs/1811.04258






Cited In (8)




