Optimistic planning for belief-augmented Markov decision processes
Document type:
Conference paper with proceedings
Title:
Optimistic planning for belief-augmented Markov decision processes
Author(s):
Fonteneau, Raphael [Author]
Department of Electrical Engineering and Computer Science [Institut Montefiore]
Busoniu, Lucian [Author]
Centre de Recherche en Automatique de Nancy [CRAN]
Munos, Rémi [Author]
Sequential Learning [SEQUEL]
Conference title:
IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013
City:
Singapore
Country:
Singapore
Conference start date:
2013-04-16
Book title:
IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013
Publication date:
2013-04-16
HAL discipline(s):
Computer Science [cs]/Artificial Intelligence [cs.AI]
English abstract: [en]
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to contexts where the transition model of the MDP is initially unknown and progressively learned through interactions with the environment. The knowledge about the unknown MDP is represented with a probability distribution over all possible transition models using Dirichlet distributions, and the BOP algorithm plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal when the budget parameter increases to infinity. Preliminary empirical validations show promising performance.
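The belief maintenance the abstract describes (a Dirichlet distribution over transition models, updated as transitions are observed) can be illustrated with a minimal sketch. This is not the paper's BOP implementation; the MDP dimensions, uniform prior, and helper names below are illustrative assumptions only.

```python
import numpy as np

# Illustrative sketch: a Dirichlet belief over the transition model of a
# small MDP. One pseudo-count per (state, action, next_state) triple;
# a uniform prior of 1 is assumed here.
n_states, n_actions = 3, 2
alpha = np.ones((n_states, n_actions, n_states))

def update_belief(alpha, s, a, s_next):
    """Bayesian update: observing a transition increments its count."""
    alpha = alpha.copy()
    alpha[s, a, s_next] += 1.0
    return alpha

def posterior_mean(alpha, s, a):
    """Expected transition probabilities under the current belief."""
    return alpha[s, a] / alpha[s, a].sum()

# Observe transition (s=0, a=1) -> s'=2 twice; counts become [1, 1, 3].
belief = update_belief(alpha, 0, 1, 2)
belief = update_belief(belief, 0, 1, 2)
print(posterior_mean(belief, 0, 1))  # [0.2, 0.2, 0.6]
```

A planner in the belief-augmented state space would treat the pair (state, `alpha`) as its state, since the counts fully summarize the posterior.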
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Files
- https://hal.archives-ouvertes.fr/hal-00840202/document
- Open access
- Access the document
- adprl.pdf
- Open access
- Access the document