Solving Bernoulli Rank-One Bandits with ...
Document type :
Communication dans un congrès avec actes
Title :
Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling
Author(s) :
Trinh, Cindy [Auteur]
Ecole Normale Supérieure Paris-Saclay [ENS Paris Saclay]
Kaufmann, Emilie [Auteur]
Scool [Scool]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Sequential Learning [SEQUEL]
Vernade, Claire [Auteur]
DeepMind [London]
Combes, Richard [Auteur]
Laboratoire des signaux et systèmes [L2S]
CentraleSupélec
Ecole Normale Supérieure Paris-Saclay [ENS Paris Saclay]
Kaufmann, Emilie [Auteur]

Scool [Scool]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Sequential Learning [SEQUEL]
Vernade, Claire [Auteur]
DeepMind [London]
Combes, Richard [Auteur]
Laboratoire des signaux et systèmes [L2S]
CentraleSupélec
Conference title :
ALT 2020 - 31st International Conference on Algorithmic Learning Theory
City :
San Diego
Country :
Etats-Unis d'Amérique
Start date of the conference :
2020-02-08
Publication date :
2020-02-08
English keyword(s) :
Multi-armed bandits
unimodal bandits
rank-one bandits
unimodal bandits
rank-one bandits
HAL domain(s) :
Statistiques [stat]/Machine Learning [stat.ML]
English abstract : [en]
Stochastic Rank-One Bandits (Katarya et al, (2017a,b)) are a simple framework for regret minimization problems over rank-one matrices of arms. The initially proposed algorithms are proved to have logarithmic regret, but ...
Show more >Stochastic Rank-One Bandits (Katarya et al, (2017a,b)) are a simple framework for regret minimization problems over rank-one matrices of arms. The initially proposed algorithms are proved to have logarithmic regret, but do not match the existing lower bound for this problem. We close this gap by first proving that rank-one bandits are a particular instance of unimodal bandits, and then providing a new analysis of Unimodal Thompson Sampling (UTS), initially proposed by Paladino et al (2017). We prove an asymptotically optimal regret bound on the frequentist regret of UTS and we support our claims with simulations showing the significant improvement of our method compared to the state-of-the-art.Show less >
Show more >Stochastic Rank-One Bandits (Katarya et al, (2017a,b)) are a simple framework for regret minimization problems over rank-one matrices of arms. The initially proposed algorithms are proved to have logarithmic regret, but do not match the existing lower bound for this problem. We close this gap by first proving that rank-one bandits are a particular instance of unimodal bandits, and then providing a new analysis of Unimodal Thompson Sampling (UTS), initially proposed by Paladino et al (2017). We prove an asymptotically optimal regret bound on the frequentist regret of UTS and we support our claims with simulations showing the significant improvement of our method compared to the state-of-the-art.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
ANR Project :
Collections :
Source :
Files
- https://hal.archives-ouvertes.fr/hal-02396943v2/document
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/128-128-all.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/16-16-log.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/32-32-all.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/4-4-all.jpg
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/4-4-log.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/64-64-all.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/8-8-log.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/closer-16-16.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/closer-32-32.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/closer-8-8.png
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943/file/row_4_col_4_log.png
- Open access
- Access the document
- http://arxiv.org/pdf/1912.03074
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943v2/document
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02396943v2/document
- Open access
- Access the document
- document
- Open access
- Access the document
- Trinh20.pdf
- Open access
- Access the document
- 128-128-all.png
- Open access
- Access the document
- 16-16-log.png
- Open access
- Access the document
- 32-32-all.png
- Open access
- Access the document
- 4-4-all.jpg
- Open access
- Access the document
- 4-4-log.png
- Open access
- Access the document
- 64-64-all.png
- Open access
- Access the document
- 8-8-log.png
- Open access
- Access the document
- closer-16-16.png
- Open access
- Access the document
- closer-32-32.png
- Open access
- Access the document
- closer-8-8.png
- Open access
- Access the document
- 1912.03074
- Open access
- Access the document
- row_4_col_4_log.png
- Open access
- Access the document