Document type:
Conference paper with published proceedings
Title:
Budgeted Reinforcement Learning in Continuous State Space
Author(s):
Carrara, Nicolas [Author]
Sequential Learning [SEQUEL]
Leurent, Edouard [Author]
RENAULT
Sequential Learning [SEQUEL]
Laroche, Romain [Author]
Urvoy, Tanguy [Author]
Orange Labs [Lannion]
Maillard, Odalric Ambrym [Author]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Pietquin, Olivier [Author]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Conference:
Conference on Neural Information Processing Systems
City:
Vancouver
Country:
Canada
Conference start date:
2019-12
Journal title:
Advances in Neural Information Processing Systems
HAL discipline(s):
Statistics [stat]/Machine Learning [stat.ML]
Abstract (in English): [en]
A Budgeted Markov Decision Process (BMDP) is an extension of a Markov Decision Process to critical applications requiring safety constraints. It relies on a notion of risk implemented in the shape of a cost signal constrained to lie below an adjustable threshold. So far, BMDPs could only be solved in the case of finite state spaces with known dynamics. This work extends the state of the art to environments with continuous state spaces and unknown dynamics. We show that the solution to a BMDP is a fixed point of a novel Budgeted Bellman Optimality operator. This observation allows us to introduce natural extensions of Deep Reinforcement Learning algorithms to address large-scale BMDPs. We validate our approach on two simulated applications: spoken dialogue and autonomous driving.
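The abstract's fixed-point idea can be illustrated on a toy finite BMDP. The sketch below is not the paper's algorithm (which targets continuous states, unknown dynamics, and stochastic budgeted policies); it is a simplified, deterministic value iteration where the state is augmented with a remaining budget, actions are pairs of an environment action and a budget granted to the next state, and among the pairs whose expected discounted cost fits the current budget we greedily keep the one with the highest expected return. All sizes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_a, n_b = 3, 2, 4                            # states, actions, budget levels
budgets = np.linspace(0.0, 2.0, n_b)               # discretized budget grid
P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))   # P[s, a] -> distribution over s'
R = rng.uniform(size=(n_s, n_a))                   # reward signal
C = rng.uniform(0.0, 0.2, size=(n_s, n_a))         # cost signal
gamma = 0.9

Vr = np.zeros((n_s, n_b))  # Vr[s, k]: return from state s with budget budgets[k]
Vc = np.zeros((n_s, n_b))  # Vc[s, k]: expected discounted cost of that behaviour

for _ in range(300):
    # Qr[s, a, k] / Qc[s, a, k]: value of playing a in s, then granting
    # budget budgets[k] to the next state.
    Qr = R[:, :, None] + gamma * np.einsum("sat,tk->sak", P, Vr)
    Qc = C[:, :, None] + gamma * np.einsum("sat,tk->sak", P, Vc)
    for s in range(n_s):
        qr, qc = Qr[s].ravel(), Qc[s].ravel()      # flatten (action, budget) pairs
        for k in range(n_b):
            ok = qc <= budgets[k] + 1e-9           # pairs that respect the budget
            if ok.any():
                j = int(np.argmax(np.where(ok, qr, -np.inf)))
            else:
                j = int(np.argmin(qc))             # nothing feasible: minimize cost
            Vr[s, k], Vc[s, k] = qr[j], qc[j]

print(np.round(Vr, 2))  # return grows weakly with the allowed budget
```

Note that the larger the budget, the larger the feasible set of (action, next-budget) pairs, so the computed return is non-decreasing in the budget level, which is the trade-off a BMDP is meant to expose.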
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- https://hal.archives-ouvertes.fr/hal-02375727/document (open access)
- http://arxiv.org/pdf/1903.01004 (open access)