• English
    • français
  • Help
  •  | 
  • Contact
  •  | 
  • About
  •  | 
  • Login
  • HAL portal
  •  | 
  • Pages Pro
  • EN
  •  / 
  • FR
View Item 
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

A Fitted-Q Algorithm for Budgeted MDPs
  • BibTeX
  • CSV
  • Excel
  • RIS

Document type :
Autre communication scientifique (congrès sans actes - poster - séminaire...)
Title :
A Fitted-Q Algorithm for Budgeted MDPs
Author(s) :
Carrara, Nicolas [Auteur]
Sequential Learning [SEQUEL]
Orange Labs [Lannion]
Laroche, Romain [Auteur]
Maluuba
Bouraoui, Jean-Léon [Auteur]
Orange Labs [Lannion]
Urvoy, Tanguy [Auteur]
Orange Labs [Lannion]
Pietquin, Olivier [Auteur] refId
Sequential Learning [SEQUEL]
SUPELEC-Campus Metz
Publication date :
2018-08
HAL domain(s) :
Informatique [cs]/Intelligence artificielle [cs.AI]
English abstract : [en]
We address the problem of bud-geted/constrained reinforcement learning in continuous state-space using a batch of transitions. For this purpose, we introduce a novel algorithm called Budgeted Fitted-Q (BFTQ). We carry out ...
Show more >
We address the problem of bud-geted/constrained reinforcement learning in continuous state-space using a batch of transitions. For this purpose, we introduce a novel algorithm called Budgeted Fitted-Q (BFTQ). We carry out some preliminary benchmarks on a continuous 2-D world. They show that BFTQ performs as well as a penalized Fitted-Q algorithm while also allowing ones to adapt the trained policy on-the-fly for a given amount of budget and without the need of engineering the reward penalties. We believe that the general principles used to design BFTQ could be used to extend others classical reinforcement learning algorithms to budget-oriented applications.Show less >
Language :
Anglais
Popular science :
Non
Collections :
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Source :
Harvested from HAL
Files
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-01867353/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-01867353/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-01867353/document
  • Open access
  • Access the document
Thumbnail
  • document
  • Open access
  • Access the document
Thumbnail
  • ncarrara-saferl-uai-2018.pdf
  • Open access
  • Access the document
Thumbnail
  • document
  • Open access
  • Access the document
Thumbnail
  • ncarrara-saferl-uai-2018.pdf
  • Open access
  • Access the document
Université de Lille

Mentions légales
Accessibilité : non conforme
Université de Lille © 2017