Document type:
Conference paper with proceedings
Title:
Dealing with Unknown Variances in Best-Arm Identification
Author(s):
Jourdan, Marc [Author]
Scool [Scool]
Degenne, Rémy [Author]
Scool [Scool]
Kaufmann, Emilie [Author]
Centre de Recherche Réseau Image SysTème Architecture et MuLtimédia [CRISTAL]
Scool [Scool]
Conference title:
Algorithmic Learning Theory (ALT)
City:
Singapore (SG)
Country:
Singapore
Conference start date:
2023-02-20
Book title:
Proceedings of Machine Learning Research (PMLR)
English keyword(s):
Bandits
Best arm identification
HAL discipline(s):
Statistics [stat]/Other [stat.ML]
English abstract: [en]
The problem of identifying the best arm among a collection of items having Gaussian rewards distribution is well understood when the variances are known. Despite its practical relevance for many applications, few works studied it for unknown variances. In this paper we introduce and analyze two approaches to deal with unknown variances, either by plugging in the empirical variance or by adapting the transportation costs. In order to calibrate our two stopping rules, we derive new time-uniform concentration inequalities, which are of independent interest. Then, we illustrate the theoretical and empirical performances of our two sampling rule wrappers on Track-and-Stop and on a Top Two algorithm. Moreover, by quantifying the impact on the sample complexity of not knowing the variances, we reveal that it is rather small.
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Files
- document
- Open access
- Access the document
- BAIUV.pdf
- Open access
- Access the document
- 2210.00974
- Open access
- Access the document