Efficient Change-Point Detection for ...
Type de document :
Compte-rendu et recension critique d'ouvrage
Titre :
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Auteur(s) :
Besson, Lilian [Auteur]
École normale supérieure - Rennes [ENS Rennes]
Kaufmann, Emilie [Auteur]
Scool [Scool]
Maillard, Odalric Ambrym [Auteur]
Scool [Scool]
Seznec, Julien [Auteur]
Scool [Scool]
Lelivrescolaire.fr
École normale supérieure - Rennes [ENS Rennes]
Kaufmann, Emilie [Auteur]
![refId](/themes/Mirage2//images/idref.png)
Scool [Scool]
Maillard, Odalric Ambrym [Auteur]
![refId](/themes/Mirage2//images/idref.png)
Scool [Scool]
Seznec, Julien [Auteur]
Scool [Scool]
Lelivrescolaire.fr
Titre de la revue :
Journal of Machine Learning Research
Éditeur :
Microtome Publishing
Date de publication :
2022-03
ISSN :
1532-4435
Mot(s)-clé(s) en anglais :
Multi-Armed Bandits
Non-Stationary Bandits
Change Point Detection
Non-Stationary Bandits
Change Point Detection
Discipline(s) HAL :
Statistiques [stat]/Autres [stat.ML]
Résumé en anglais : [en]
We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint ...
Lire la suite >We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA \Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances, where A is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.Lire moins >
Lire la suite >We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA \Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances, where A is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.Lire moins >
Langue :
Anglais
Vulgarisation :
Non
Projet ANR :
Collections :
Source :
Fichiers
- https://hal.inria.fr/hal-02006471v3/document
- Accès libre
- Accéder au document
- http://arxiv.org/pdf/1902.01575
- Accès libre
- Accéder au document
- https://hal.inria.fr/hal-02006471v3/document
- Accès libre
- Accéder au document
- https://hal.inria.fr/hal-02006471v3/document
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- BKMS22%20%281%29.pdf
- Accès libre
- Accéder au document
- 1902.01575
- Accès libre
- Accéder au document