Safe learning for near-optimal scheduling
From MaRDI portal
Abstract: In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. Our algorithms can handle Markov decision processes (MDPs) that have 1020 states and beyond which cannot be handled with state-of-the art probabilistic model-checkers. We provide probably approximately correct (PAC) guarantees for learning the model. Additionally, we extend Monte-Carlo tree search with advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we implemented and compared our algorithms empirically against shielded deep Q-learning on large task systems.
Recommendations
- As soon as probable: optimal scheduling under stochastic uncertainty
- scientific article; zbMATH DE number 1956584
- Scheduling with timed automata
- Verification of Markov decision processes using learning algorithms
- Synthesising succinct strategies in safety games with an application to real-time scheduling
Cites work
- scientific article; zbMATH DE number 1134975 (Why is no real title available?)
- scientific article; zbMATH DE number 7559496 (Why is no real title available?)
- scientific article; zbMATH DE number 7561341 (Why is no real title available?)
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- A theory of the learnable
- Continuity of the value of competitive Markov decision processes
- Hard real-time computing systems. Predictable scheduling algorithms and applications.
- Learning-based mean-payoff optimization in an unknown MDP under omega-regular constraints
- On the synthesis of strategies in infinite games
- Robustness of structurally equivalent concurrent parity games
- Stochastic games
- Supervisory Control of a Class of Discrete Event Processes
- \({\mathcal Q}\)-learning
This page was built for publication: Safe learning for near-optimal scheduling
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q832074)