• English
    • français
  • Help
  •  | 
  • Contact
  •  | 
  • About
  •  | 
  • Login
  • HAL portal
  •  | 
  • Pages Pro
  • EN
  •  / 
  • FR
View Item 
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Optimistic planning for belief-augmented ...
  • BibTeX
  • CSV
  • Excel
  • RIS

Document type :
Communication dans un congrès avec actes
Title :
Optimistic planning for belief-augmented Markov decision processes
Author(s) :
Fonteneau, Raphael [Auteur]
Department of Electrical Engineering and Computer Science [Institut Montefiore]
Busoniu, Lucian [Auteur]
Centre de Recherche en Automatique de Nancy [CRAN]
Munos, Rémi [Auteur]
Sequential Learning [SEQUEL]
Conference title :
IEEE International Symposium on Adaptive Dynamic Programming and reinforcement Learning, ADPRL 2013
City :
Singapour
Country :
Singapour
Start date of the conference :
2013-04-16
Book title :
IEEE International Symposium on Adaptive Dynamic Programming and reinforcement Learning, ADPRL 2013
Publication date :
2013-04-16
HAL domain(s) :
Informatique [cs]/Intelligence artificielle [cs.AI]
English abstract : [en]
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes ...
Show more >
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [10], [9] to contexts where the transition model of the MDP is initially unknown and progressively learned through interactions within the environment. The knowledge about the unknown MDP is represented with a probability distribution over all possible transition models using Dirichlet distributions, and the BOP algorithm plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal when the budget parameter increases to infinity. Preliminary empirical validations show promising performance.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
Collections :
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Source :
Harvested from HAL
Files
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-00840202/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-00840202/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-00840202/document
  • Open access
  • Access the document
Université de Lille

Mentions légales
Université de Lille © 2017