Calibrated Fairness in Bandits
Document type:
Preprint or working paper
Title :
Calibrated Fairness in Bandits
Author(s):
Liu, Yang [Author]
Harvard John A. Paulson School of Engineering and Applied Sciences [SEAS]
Radanovic, Goran [Author]
Harvard John A. Paulson School of Engineering and Applied Sciences [SEAS]
Dimitrakakis, Christos [Author]
Sequential Learning [SEQUEL]
Mandal, Debmalya [Author]
Harvard John A. Paulson School of Engineering and Applied Sciences [SEAS]
Parkes, David [Author]
Harvard John A. Paulson School of Engineering and Applied Sciences [SEAS]
HAL domain(s):
Statistics [stat]/Machine Learning [stat.ML]
English abstract:
We study fairness within the stochastic, multi-armed bandit (MAB) decision making framework. We adapt the fairness framework of "treating similar individuals similarly" [5] to this setting. Here, an 'individual' corresponds to an arm and two arms are 'similar' if they have a similar quality distribution. First, we adopt a smoothness constraint that if two arms have a similar quality distribution then the probability of selecting each arm should be similar. In addition, we define the fairness regret, which corresponds to the degree to which an algorithm is not calibrated, where perfect calibration requires that the probability of selecting an arm is equal to the probability with which the arm has the best quality realization. We show that a variation on Thompson sampling satisfies smooth fairness for total variation distance, and give an Õ((kT)^{2/3}) bound on fairness regret. This complements prior work [12], which protects an on-average better arm from being less favored. We also explain how to extend our algorithm to the dueling bandit setting.
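The calibration notion in the abstract (select each arm with the probability that it currently appears to be the best) is the defining property of standard Thompson sampling over the posterior. A minimal sketch for Bernoulli arms with Beta(1,1) priors illustrates the idea; the arm means here are hypothetical, and this is plain Thompson sampling, not the paper's modified fair variant:

```python
import random


def thompson_select(successes, failures, rng):
    """One round of Thompson sampling over Bernoulli arms.

    Draw one sample per arm from its Beta(successes+1, failures+1)
    posterior and play the argmax. By construction, each arm is chosen
    with exactly its posterior probability of having the best quality
    realization, which is the calibration property described above.
    """
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


# Short simulation with two Bernoulli arms (hypothetical true means).
rng = random.Random(0)
means = [0.7, 0.5]
succ, fail = [0, 0], [0, 0]
for _ in range(2000):
    arm = thompson_select(succ, fail, rng)
    reward = 1 if rng.random() < means[arm] else 0
    succ[arm] += reward
    fail[arm] += 1 - reward

pulls = [s + f for s, f in zip(succ, fail)]
print("pulls per arm:", pulls)
```

Early on, the posteriors overlap heavily and both arms are played with similar probability, which is the intuition behind the smooth-fairness constraint; as evidence accumulates, play concentrates on the better arm.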
Language:
English
Collections :
Source :
Files
- https://hal.inria.fr/hal-01953314/document
- Open access
- Access the document
- 1707.01875.pdf
- Open access
- Access the document