Model selection in reinforcement learning (Q415618): Difference between revisions

@@ Property / Mathematics Subject Classification ID @@
+68T05
@@ Property / Mathematics Subject Classification ID: 68T05 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+68Q32
@@ Property / Mathematics Subject Classification ID: 68Q32 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C40
@@ Property / Mathematics Subject Classification ID: 90C40 / rank @@
+Normal rank
@@ Property / zbMATH DE Number @@
+6031868
@@ Property / zbMATH DE Number: 6031868 / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+reinforcement learning
@@ Property / zbMATH Keywords: reinforcement learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+model selection
@@ Property / zbMATH Keywords: model selection / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+complexity regularization
@@ Property / zbMATH Keywords: complexity regularization / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+adaptivity
@@ Property / zbMATH Keywords: adaptivity / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+offline learning
@@ Property / zbMATH Keywords: offline learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+off-policy learning
@@ Property / zbMATH Keywords: off-policy learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+finite-sample bounds
@@ Property / zbMATH Keywords: finite-sample bounds / rank @@
+Normal rank
@@ Property / describes a project that uses @@
+R-MAX
@@ Property / describes a project that uses: R-MAX / rank @@
+Normal rank
@@ Property / describes a project that uses @@
+ElemStatLearn
@@ Property / describes a project that uses: ElemStatLearn / rank @@
+Normal rank
@@ Property / describes a project that uses @@
+PRMLT
@@ Property / describes a project that uses: PRMLT / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1007/s10994-011-5254-7
+Normal rank
@@ Property / OpenAlex ID @@
+W2006330826
@@ Property / OpenAlex ID: W2006330826 / rank @@
+Normal rank
@@ Property / cites work @@
+Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
+Normal rank
@@ Property / cites work @@
+A survey of cross-validation procedures for model selection
+Normal rank
@@ Property / cites work @@
+Q3973919
@@ Property / cites work: Q3973919 / rank @@
+Normal rank
@@ Property / cites work @@
+Model selection and error estimation
@@ Property / cites work: Model selection and error estimation / rank @@
+Normal rank
@@ Property / cites work @@
+Local Rademacher complexities
@@ Property / cites work: Local Rademacher complexities / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic optimal control. The discrete time case
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Q5483032
@@ Property / cites work: Q5483032 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3093261
@@ Property / cites work: Q3093261 / rank @@
+Normal rank
@@ Property / cites work @@
+Memory-universal prediction of stationary random processes
+Normal rank
@@ Property / cites work @@
+A distribution-free theory of nonparametric regression
+Normal rank
@@ Property / cites work @@
+The elements of statistical learning. Data mining, inference, and prediction
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Complexity regularization via localized random penalties
+Normal rank
@@ Property / cites work @@
+Nonparametric time series prediction through adaptive model selection
+Normal rank
@@ Property / cites work @@
+Basis function adaptation in temporal difference reinforcement learning
+Normal rank
@@ Property / cites work @@
+Markov Chains and Stochastic Stability
@@ Property / cites work: Markov Chains and Stochastic Stability / rank @@
+Normal rank
@@ Property / cites work @@
+Q3394879
@@ Property / cites work: Q3394879 / rank @@
+Normal rank
@@ Property / cites work @@
+Concentration of measure inequalities for Markov chains and \(\Phi\)-mixing processes.
+Normal rank
@@ Property / cites work @@
+Algorithms for Reinforcement Learning
@@ Property / cites work: Algorithms for Reinforcement Learning / rank @@
+Normal rank
@@ Property / cites work @@
+Q3655724
@@ Property / cites work: Q3655724 / rank @@
+Normal rank
@@ Property / cites work @@
+Oracle inequalities for multi-fold cross validation
+Normal rank
@@ Property / cites work @@
+Model selection in nonparametric regression
@@ Property / cites work: Model selection in nonparametric regression / rank @@
+Normal rank
@@ links / mardi / name / links / mardi / name @@
+Publication:415618