Learning control of finite Markov chains with an explicit trade-off between estimation and control

Publication:3828969

DOI: 10.1109/21.21595; zbMATH: 0674.65036; OpenAlex: W2015667537; MaRDI QID: Q3828969

Hiroshi Takeda, Mitsuo Sato, Ken-Ichi Abe

Publication date: 1988

Published in: IEEE Transactions on Systems, Man, and Cybernetics

Full work available at URL: https://doi.org/10.1109/21.21595

zbMATH Keywords

stochastic control; control parameter; finite Markov chains; control policy; performance criterion; asymptotic optimization; frequency coefficient; large size models; learning control problem

Mathematics Subject Classification ID

Numerical optimization and variational techniques (65K10); Estimation and detection in stochastic control theory (93E10); Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Optimal stochastic control (93E20)

Related Items (3)

An incremental off-policy search in a model-free Markov decision process using a single sample path ⋮ A job scheduling approach based on a learning automaton for a distributed computing system ⋮ \({\mathcal Q}\)-learning
