Document type :
Scientific journal article: Original article
Title :
Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals
Author(s) :
Kaufmann, Emilie [Author]
Centre National de la Recherche Scientifique [CNRS]
Scool [Scool]
Koolen, Wouter [Author]
Centrum Wiskunde & Informatica [CWI]
Journal title :
Journal of Machine Learning Research
Publisher :
Microtome Publishing
Publication date :
2021-12-06
ISSN :
1532-4435
English keyword(s) :
multi-armed bandits
best arm identification
test martingales
adaptive sequential testing
mixture methods
HAL domain(s) :
Statistics [stat]/Machine Learning [stat.ML]
English abstract : [en]
This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and by multiplying those martingales. Our deviation inequalities allow us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Files
- https://hal.archives-ouvertes.fr/hal-01886612v3/document (open access)
- http://arxiv.org/pdf/1811.11419 (open access)