Covariance-adapting algorithm for semi-bandits ...
Document type :
Communication dans un congrès avec actes
Title :
Covariance-adapting algorithm for semi-bandits with application to sparse outcomes
Author(s) :
Perrault, Pierre [Auteur]
Scool [Scool]
Sequential Learning [SEQUEL]
Ecole Normale Supérieure Paris-Saclay [ENS Paris Saclay]
Perchet, Vianney [Auteur]
Ecole Nationale de la Statistique et de l'Analyse Economique [ENSAE]
Valko, Michal [Auteur]
DeepMind [Paris]
Scool [Scool]
Sequential Learning [SEQUEL]
Ecole Normale Supérieure Paris-Saclay [ENS Paris Saclay]
Perchet, Vianney [Auteur]
Ecole Nationale de la Statistique et de l'Analyse Economique [ENSAE]
Valko, Michal [Auteur]

DeepMind [Paris]
Conference title :
Conference on Learning Theory
City :
Graz
Country :
Autriche
Start date of the conference :
2020
Publication date :
2020
English keyword(s) :
combinatorial stochastic semi-bandits
covariance
confidence ellipsoid
sparsity
covariance
confidence ellipsoid
sparsity
HAL domain(s) :
Mathématiques [math]
Mathématiques [math]/Statistiques [math.ST]
Informatique [cs]
Informatique [cs]/Apprentissage [cs.LG]
Mathématiques [math]/Statistiques [math.ST]
Informatique [cs]
Informatique [cs]/Apprentissage [cs.LG]
English abstract : [en]
We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in the standard bandits). Typical distributions considered depend ...
Show more >We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in the standard bandits). Typical distributions considered depend on specific parameter values, whose prior knowledge is required in theory but quite difficult to estimate in practice; an example is the commonly assumed sub-Gaussian family. We alleviate this issue by instead considering a new general family of sub-exponential distributions, which contains bounded and Gaussian ones. We prove a new lower bound on the regret on this family, that is parameterized by the unknown covariance matrix, a tighter quantity than the sub-Gaussian matrix. We then construct an algorithm that uses covariance estimates, and provide a tight asymptotic analysis of the regret. Finally, we apply and extend our results to the family of sparse outcomes, which has applications in many recommender systems.Show less >
Show more >We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in the standard bandits). Typical distributions considered depend on specific parameter values, whose prior knowledge is required in theory but quite difficult to estimate in practice; an example is the commonly assumed sub-Gaussian family. We alleviate this issue by instead considering a new general family of sub-exponential distributions, which contains bounded and Gaussian ones. We prove a new lower bound on the regret on this family, that is parameterized by the unknown covariance matrix, a tighter quantity than the sub-Gaussian matrix. We then construct an algorithm that uses covariance estimates, and provide a tight asymptotic analysis of the regret. Finally, we apply and extend our results to the family of sparse outcomes, which has applications in many recommender systems.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
Collections :
Source :
Files
- https://hal.archives-ouvertes.fr/hal-02876102/document
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02876102/document
- Open access
- Access the document
- https://hal.archives-ouvertes.fr/hal-02876102/document
- Open access
- Access the document
- document
- Open access
- Access the document
- colt.pdf
- Open access
- Access the document
- document
- Open access
- Access the document
- colt.pdf
- Open access
- Access the document