Boundary Crossing for General Exponential Families
Document type :
Conference paper with proceedings
Title :
Boundary Crossing for General Exponential Families
Author(s) :
Conference title :
Algorithmic Learning Theory
City :
Kyoto
Country :
Japan
Start date of the conference :
2017-10
Book title :
Journal of Machine Learning Research
Journal title :
Proceedings of Algorithmic Learning Theory
Publication date :
2017
English keyword(s) :
Exponential Families
Bregman Concentration
Multi-armed Bandits
Optimality
HAL domain(s) :
Statistics [stat]/Machine Learning [stat.ML]
Mathematics [math]/Statistics [math.ST]
English abstract : [en]
We consider parametric exponential families of dimension K on the real line. We study a variant of boundary crossing probabilities coming from the multi-armed bandit literature, in the case when the real-valued distributions form an exponential family of dimension K. Formally, our result is a concentration inequality that bounds the probability that $B_\psi(\theta_n, \theta) \geq f(t/n)/n$, where $\theta$ is the parameter of an unknown target distribution, $\theta_n$ is the empirical parameter estimate built from n observations, $\psi$ is the log-partition function of the exponential family and $B_\psi$ is the corresponding Bregman divergence. From the perspective of stochastic multi-armed bandits, we pay special attention to the case when the boundary function f is logarithmic, as it makes it possible to analyze the regret of the state-of-the-art KL-ucb and KL-ucb+ strategies, whose analysis was left open in such generality. Indeed, previous results only hold for the case when K = 1, while we provide results for arbitrary finite dimension K, thus considerably extending the existing results. Perhaps surprisingly, we highlight that the proof techniques to achieve these strong results already existed three decades ago in the work of T.L. Lai, and were apparently forgotten in the bandit community. We provide a modern rewriting of these beautiful techniques, which we believe are useful beyond the application to stochastic multi-armed bandits.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
Source :
Files
- https://hal.archives-ouvertes.fr/hal-01615427/document
- Open access
- 15.pdf
- Open access