Efficient Change-Point Detection for ...
Document type :
Compte-rendu et recension critique d'ouvrage
Title :
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Author(s) :
Besson, Lilian [Auteur]
École normale supérieure - Rennes [ENS Rennes]
Kaufmann, Emilie [Auteur]
Scool [Scool]
Maillard, Odalric Ambrym [Auteur]
Scool [Scool]
Seznec, Julien [Auteur]
Scool [Scool]
Lelivrescolaire.fr
École normale supérieure - Rennes [ENS Rennes]
Kaufmann, Emilie [Auteur]
Scool [Scool]
Maillard, Odalric Ambrym [Auteur]
Scool [Scool]
Seznec, Julien [Auteur]
Scool [Scool]
Lelivrescolaire.fr
Journal title :
Journal of Machine Learning Research
Publisher :
Microtome Publishing
Publication date :
2022-03
ISSN :
1532-4435
English keyword(s) :
Multi-Armed Bandits
Non-Stationary Bandits
Change Point Detection
Non-Stationary Bandits
Change Point Detection
HAL domain(s) :
Statistiques [stat]/Autres [stat.ML]
English abstract : [en]
We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint ...
Show more >We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA \Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances, where A is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.Show less >
Show more >We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA \Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances, where A is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.Show less >
Language :
Anglais
Popular science :
Non
ANR Project :
Collections :
Source :
Files
- https://hal.inria.fr/hal-02006471v3/document
- Open access
- Access the document
- http://arxiv.org/pdf/1902.01575
- Open access
- Access the document
- https://hal.inria.fr/hal-02006471v3/document
- Open access
- Access the document
- https://hal.inria.fr/hal-02006471v3/document
- Open access
- Access the document
- document
- Open access
- Access the document
- BKMS22%20%281%29.pdf
- Open access
- Access the document
- 1902.01575
- Open access
- Access the document