Monte-Carlo Tree Search by Best Arm Identification
Type de document :
Communication dans un congrès avec actes
Titre :
Monte-Carlo Tree Search by Best Arm Identification
Auteur(s) :
Kaufmann, Emilie [Auteur]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Koolen, Wouter [Auteur]
Centrum Wiskunde & Informatica [CWI]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Koolen, Wouter [Auteur]
Centrum Wiskunde & Informatica [CWI]
Titre de la manifestation scientifique :
NIPS 2017 - 31st Annual Conference on Neural Information Processing Systems
Ville :
Long Beach
Pays :
Etats-Unis d'Amérique
Date de début de la manifestation scientifique :
2017-12-04
Titre de la revue :
Advances in Neural Information Processing Systems
Date de publication :
2017-12
Mot(s)-clé(s) en anglais :
best arm identification
multi-armed bandit
Monte-Carlo Tree Search
multi-armed bandit
Monte-Carlo Tree Search
Discipline(s) HAL :
Statistiques [stat]/Machine Learning [stat.ML]
Résumé en anglais : [en]
Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems. We study the game tree search ...
Lire la suite >Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems. We study the game tree search problem, where the goal is to quickly identify the optimal move in a given game tree by sequentially sampling its stochastic payoffs. We develop new algorithms for trees of arbitrary depth, that operate by summarizing all deeper levels of the tree into confidence intervals at depth one, and applying a best arm identification procedure at the root. We prove new sample complexity guarantees with a refined dependence on the problem instance. We show experimentally that our algorithms outperform existing elimination-based algorithms and match previous special-purpose methods for depth-two trees.Lire moins >
Lire la suite >Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems. We study the game tree search problem, where the goal is to quickly identify the optimal move in a given game tree by sequentially sampling its stochastic payoffs. We develop new algorithms for trees of arbitrary depth, that operate by summarizing all deeper levels of the tree into confidence intervals at depth one, and applying a best arm identification procedure at the root. We prove new sample complexity guarantees with a refined dependence on the problem instance. We show experimentally that our algorithms outperform existing elimination-based algorithms and match previous special-purpose methods for depth-two trees.Lire moins >
Langue :
Anglais
Comité de lecture :
Oui
Audience :
Internationale
Vulgarisation :
Non
Projet ANR :
Collections :
Source :
Fichiers
- https://hal.archives-ouvertes.fr/hal-01535907v2/document
- Accès libre
- Accéder au document
- http://arxiv.org/pdf/1706.02986
- Accès libre
- Accéder au document
- https://hal.archives-ouvertes.fr/hal-01535907v2/document
- Accès libre
- Accéder au document
- https://hal.archives-ouvertes.fr/hal-01535907v2/document
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- KK17arXiv.pdf
- Accès libre
- Accéder au document
- 1706.02986
- Accès libre
- Accéder au document