• English
    • français
  • Help
  •  | 
  • Contact
  •  | 
  • About
  •  | 
  • Login
  • HAL portal
  •  | 
  • Pages Pro
  • EN
  •  / 
  • FR
View Item 
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Direct Policy Iteration with Demonstrations
  • BibTeX
  • CSV
  • Excel
  • RIS

Document type :
Communication dans un congrès avec actes
Title :
Direct Policy Iteration with Demonstrations
Author(s) :
Chemali, Jessica [Auteur]
Computer Science Department - Carnegie Mellon University
Lazaric, Alessandro [Auteur] refId
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Sequential Learning [SEQUEL]
Conference title :
IJCAI - 24th International Joint Conference on Artificial Intelligence
City :
Buenos Aires
Country :
Argentine
Start date of the conference :
2015-07-25
Publication date :
2015-07
HAL domain(s) :
Statistiques [stat]/Machine Learning [stat.ML]
English abstract : [en]
We consider the problem of learning the optimal policy of an unknown Markov decision process (MDP) when expert demonstrations are available along with interaction samples. We build on classification-based policy iteration ...
Show more >
We consider the problem of learning the optimal policy of an unknown Markov decision process (MDP) when expert demonstrations are available along with interaction samples. We build on classification-based policy iteration to perform a seamless integration of interaction and expert data, thus obtaining an algorithm which can benefit from both sources of information at the same time. Furthermore , we provide a full theoretical analysis of the performance across iterations providing insights on how the algorithm works. Finally, we report an empirical evaluation of the algorithm and a comparison with the state-of-the-art algorithms.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
ANR Project :
Extraction et transfert de connaissances dans l'apprentissage par renforcement
Collections :
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Source :
Harvested from HAL
Files
Thumbnail
  • https://hal.inria.fr/hal-01237659/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.inria.fr/hal-01237659/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.inria.fr/hal-01237659/document
  • Open access
  • Access the document
Thumbnail
  • document
  • Open access
  • Access the document
Thumbnail
  • DPID_CameraReady.pdf
  • Open access
  • Access the document
Thumbnail
  • document
  • Open access
  • Access the document
Thumbnail
  • DPID_CameraReady.pdf
  • Open access
  • Access the document
Université de Lille

Mentions légales
Accessibilité : non conforme
Université de Lille © 2017