Best-Arm Identification in Linear Bandits
Document type :
Communication dans un congrès avec actes
Title :
Best-Arm Identification in Linear Bandits
Author(s) :
Soare, Marta [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Lazaric, Alessandro [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Munos, Rémi [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Lazaric, Alessandro [Auteur]

Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Munos, Rémi [Auteur]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Sequential Learning [SEQUEL]
Conference title :
NIPS - Advances in Neural Information Processing Systems 27
City :
Montreal
Country :
Canada
Start date of the conference :
2014-12-08
Publication date :
2014-10-20
HAL domain(s) :
Informatique [cs]/Apprentissage [cs.LG]
Informatique [cs]/Algorithme et structure de données [cs.DS]
Informatique [cs]/Algorithme et structure de données [cs.DS]
English abstract : [en]
We study the best-arm identification problem in linear bandit, where the rewards of the arms depend linearly on an unknown parameter $\theta^*$ and the objective is to return the arm with the largest reward. We characterize ...
Show more >We study the best-arm identification problem in linear bandit, where the rewards of the arms depend linearly on an unknown parameter $\theta^*$ and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In particular, we show the importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms. We analyze the proposed strategies and compare their empirical performance. Finally, we point out the connection to the $G$-optimality criterion used in optimal experimental design.Show less >
Show more >We study the best-arm identification problem in linear bandit, where the rewards of the arms depend linearly on an unknown parameter $\theta^*$ and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In particular, we show the importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms. We analyze the proposed strategies and compare their empirical performance. Finally, we point out the connection to the $G$-optimality criterion used in optimal experimental design.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
European Project :
Collections :
Source :
Files
- https://hal.inria.fr/hal-01075701/document
- Open access
- Access the document
- https://hal.inria.fr/hal-01075701/document
- Open access
- Access the document
- document
- Open access
- Access the document
- best_linear_arm.pdf
- Open access
- Access the document
- document
- Open access
- Access the document
- best_linear_arm.pdf
- Open access
- Access the document