Showing posts with the label LSPI. Show all posts

Online LSPI

Thursday, April 11, 2013
  • Li, L., Littman, M. L., & Mansley, C. R. (2009) Online Exploration in LSPI (slides, article, techreport)
  • Buşoniu, L., Ernst, D., De Schutter, B., & Babuška, R. (2010) Online LSPI for RL control. In Proceedings of the American Control Conference (ACC), 2010. (proceedings)
  • Buşoniu, L., De Schutter, B., Babuška, R., & Ernst, D. (2010) Using prior knowledge to accelerate online LSPI (article?)
  • Buşoniu, L., De Schutter, B., Babuška, R., & Ernst, D. (2010) Exploiting policy knowledge in online LSPI: An empirical study. Automation, Computers, Applied Mathematics, 19(4), pp. 521–529. (techreport)

LSPI

Wednesday, April 10, 2013


1. Fern, A., Batch RL Via LSPI (slides)
2. Elkan, C. (2012) Least squares policy iteration (LSPI) (article)
3. Buşoniu, L., Lazaric, A., Ghavamzadeh, M., Munos, R., Babuška, R., & De Schutter, B. (2011) Least-squares methods for policy iteration. In Reinforcement Learning: State of the Art. Springer. (article)
4. Lagoudakis, M. G., & Parr, R. (2003) LSPI (article, code)
5. Lagoudakis, M. G., & Parr, R. (2001) Model-Free LSPI (proceedings)
  • Lazaric, A., Ghavamzadeh, M., & Munos, R. (2011) Finite-sample analysis of least-squares policy iteration. Journal of Machine Learning Research, 13:3041–3074.
  • Xu, X., Hu, D., & Lu, X. (2007) Kernel-Based LSPI for RL (article)
  • Ma, J., & Powell, W. B. (2009) Convergence Proofs of LSPI Algorithm for High-Dimensional Infinite Horizon Markov Decision Process Problems (article)
  • Thiery, C., & Scherrer, B. (2010) Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems (article)