- Li, L., Littman, M. L., & Mansley, C. R. (2009) Online Exploration in LSPI (slides, article, techreport)
- Busoniu, L., Ernst, D., De Schutter, B., & Babuška, R. (2010) Online LSPI for RL control. In Proceedings of the American Control Conference (ACC), 2010. (proceedings)
- Busoniu, L., De Schutter, B., Babuška, R., & Ernst, D. (2010) Using prior knowledge to accelerate online LSPI (article?)
- Buşoniu, L., De Schutter, B., Babuška, R., & Ernst, D. (2010) Exploiting policy knowledge in online LSPI: An empirical study. Automation, Computers, Applied Mathematics, 19(4), pp. 521–529. (techreport)
Showing posts with the label LSPI.
Online LSPI
Thursday, April 11, 2013
Posted by Unknown
Labels:
exploration,
LSPI,
online,
RL
LSPI
Wednesday, April 10, 2013
Posted by Unknown
1. Fern, A., Batch RL Via LSPI (slides)
2. Elkan, C. (2012) Least squares policy iteration (LSPI) (article)
3. Busoniu, L., Lazaric, A., Ghavamzadeh, M., Munos, R., Babuska, R., and De Schutter, B. (2011) Least-squares methods for policy iteration. Reinforcement Learning: State of the Art. Springer. (article)
4. Lagoudakis, M. G., & Parr, R. (2003) LSPI (article, code)
5. Lagoudakis, M. G., & Parr, R. (2001) Model-Free LSPI (proceedings)
- Lazaric, A., Ghavamzadeh, M., & Munos, R. (2012) Finite-sample analysis of least-squares policy iteration. Journal of Machine Learning Research, 13:3041–3074.
- Xu, X., Hu, D., & Lu, X. (2007) Kernel-Based LSPI for RL (article)
- Ma, J., & Powell, W. B. (2009) Convergence Proofs of LSPI Algorithm for High-Dimensional Infinite Horizon Markov Decision Process Problems (article)
- Thiery, C., & Scherrer, B. (2010) Least-Squares lambda Policy Iteration: Bias-Variance Trade-off in Control Problems (article)