Optimal control with learning on the fly: a toy problem (Q832436)

From MaRDI portal

scientific article; zbMATH DE number 7498312

Statements

scholarly article

Optimal control with learning on the fly: a toy problem (English)

Bernat Guillén Pegueroles

Clarence W. Rowley

Charles Fefferman

Revista Matemática Iberoamericana

publication date

25 March 2022

full work available at URL

https://arxiv.org/abs/2002.11578

Summary: We exhibit optimal control strategies for a simple toy problem in which the underlying dynamics depend on a parameter that is initially unknown and must be learned. We consider a cost function posed over a finite time interval, in contrast to much previous work that considers asymptotics as the time horizon tends to infinity. We study several different versions of the problem, including Bayesian control, in which we assume a prior distribution on the unknown parameter; and "agnostic" control, in which we assume nothing about the unknown parameter. For the agnostic problems, we compare our performance with that of an opponent who knows the value of the parameter. This comparison gives rise to several notions of "regret", and we obtain strategies that minimize the "worst-case regret" arising from the most unfavorable choice of the unknown parameter. In every case, the optimal strategy turns out to be a Bayesian strategy or a limit of Bayesian strategies.

zbMATH Keywords

regret

competitive ratio

agnostic control

adaptive control

fuel tax regret

MaRDI profile type

MaRDI publication profile

Finite-time analysis of the multiarmed bandit problem

Dynamic programming and optimal control. Vol. 1.

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

Prediction, Learning, and Games

Asymptotically efficient adaptive allocation rules

Approximate dynamic programming. Solving the curses of dimensionality

Recommended article

Controlling unknown linear dynamics with bounded multiplicative regret

Similarity Score

0.8453490138053894

Recommender Run

Recommender Run 4

Optimal Control of an Unknown Linear Process with Learning

Similarity Score

0.843239426612854

Recommender Run

Recommender Run 4

Learning and control in a changing economic environment.

Similarity Score

0.7937319278717041

Recommender Run

Recommender Run 4

Bayes' learning of unknown parameters

Similarity Score

0.781589150428772

Recommender Run

Recommender Run 4

Similarity Score

0.7780045866966248

Recommender Run

Recommender Run 4

Identifiers

zbMATH Open document ID

DOI

10.4171/RMI/1275

Mathematics Subject Classification ID

zbMATH DE Number

7498312
