Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies (Q2318167)

scientific article

Language	Label	Description	Also known as
English	Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies	scientific article

Statements

instance of

scholarly article

0 references

title

Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies (English)

0 references

14 August 2019

0 references

zbMATH Keywords

reinforcement learning

0 references

architecture

0 references

average cost

0 references

Markov chains

0 references

optimization

0 references

MaRDI profile type

MaRDI publication profile

0 references

full work available at URL

https://doi.org/10.1007/s00500-018-3225-7

0 references

cites work

A near-optimal polynomial time algorithm for learning in certain classes of stochastic games

0 references

10.1162/153244303765208377

0 references

Q2896090

0 references

Near-optimal reinforcement learning in polynomial time

0 references

Learning automata and stochastic optimization

0 references

Q4485809

0 references

Q4315289

0 references

Q4626283

0 references

Identifiers

zbMATH Open document ID

1418.93140

0 references

Mathematics Subject Classification ID

0 references

10.1007/S00500-018-3225-7

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:2318167