• English
    • français
  • Help
  •  | 
  • Contact
  •  | 
  • About
  •  | 
  • Login
  • HAL portal
  •  | 
  • Pages Pro
  • EN
  •  / 
  • FR
View Item 
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Efficient Change-Point Detection for ...
  • BibTeX
  • CSV
  • Excel
  • RIS

Document type :
Article dans une revue scientifique
Title :
Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Author(s) :
Besson, Lilian [Auteur]
École normale supérieure - Rennes [ENS Rennes]
Kaufmann, Emilie [Auteur] refId
Scool [Scool]
Maillard, Odalric-Ambrym [Auteur] refId
Scool [Scool]
Seznec, Julien [Auteur]
Lelivrescolaire.fr
Scool [Scool]
Journal title :
Journal of Machine Learning Research
Publisher :
Microtome Publishing
Publication date :
2022-03
ISSN :
1532-4435
English keyword(s) :
Multi-Armed Bandits
Non-Stationary Bandits
Change Point Detection
HAL domain(s) :
Statistiques [stat]/Autres [stat.ML]
English abstract : [en]
We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint ...
Show more >
We introduce GLR-klUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, kl-UCB, with an efficient, parameter-free, changepoint detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLR-klUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA \Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances, where A is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLR-klUCB is also very efficient in practice, beyond easy instances.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
ANR Project :
BANDITS MANCHOTS POUR SIGNAUX NON-STATIONNAIRES ET STRUCTURES
Au delà de l'apprentissage séquentiel pour de meilleures prises de décisions
Collections :
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Source :
Harvested from HAL
Files
Thumbnail
  • https://hal.inria.fr/hal-02006471v3/document
  • Open access
  • Access the document
Thumbnail
  • http://arxiv.org/pdf/1902.01575
  • Open access
  • Access the document
Thumbnail
  • https://hal.inria.fr/hal-02006471v3/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.inria.fr/hal-02006471v3/document
  • Open access
  • Access the document
Université de Lille

Mentions légales
Université de Lille © 2017