A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

Gajane, Pratik; Urvoy, Tanguy; Clérot, Fabrice

Type de document :

Communication dans un congrès avec actes

Titre :

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

Auteur(s) :

Gajane, Pratik [Auteur]
Sequential Learning [SEQUEL]
Orange Labs [Lannion]
Urvoy, Tanguy [Auteur]
Orange Labs [Lannion]
Clérot, Fabrice [Auteur]
Orange Labs [Lannion]

Titre de la manifestation scientifique :

Proceedings of the 32nd International Conference on Machine Learning

Ville :

Lille

Pays :

France

Date de début de la manifestation scientifique :

2015-07-06

Date de publication :

2015

Mot(s)-clé(s) en anglais :

MAB
online learning
dueling bandits

Discipline(s) HAL :

Informatique [cs]/Autre [cs.OH]

Résumé en anglais : [en]

We study the K-armed dueling bandit problem which is a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose a new ...
Lire la suite >We study the K-armed dueling bandit problem which is a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose a new algorithm called Relative Exponential-weight algorithm for Exploration and Exploitation (REX3) to handle the adversarial utility-based formulation of this problem. This algorithm is a non-trivial extension of the Exponential-weight algorithm for Exploration and Exploitation (EXP3) algorithm. We prove a finite time expected regret upper bound of order O(sqrt(K ln(K)T)) for this algorithm and a general lower bound of order omega(sqrt(KT)). At the end, we provide experimental results using real data from information retrieval applications.Lire moins >

Langue :

Anglais

Comité de lecture :

Oui

Audience :

Internationale

Vulgarisation :

Non

Collections :

Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189

Source :

Harvested from HAL

Fichiers

https://hal.inria.fr/hal-01225614/document
Accès libre
Accéder au document

https://hal.inria.fr/hal-01225614/file/gajane15-supp.zip
Accès libre
Accéder au document

http://arxiv.org/pdf/1601.03855
Accès libre
Accéder au document

https://hal.inria.fr/hal-01225614/document
Accès libre
Accéder au document

https://hal.inria.fr/hal-01225614/document
Accès libre
Accéder au document

document
Accès libre
Accéder au document

rex3_icml.pdf
Accès libre
Accéder au document

gajane15-supp.zip
Accès libre
Accéder au document

1601.03855
Accès libre
Accéder au document

A Relative Exponential Weighing Algorithm ... BibTeX CSV Excel RIS

Fichiers

A Relative Exponential Weighing Algorithm ...

BibTeX

CSV

Excel

RIS