Document type:
Conference paper with proceedings
Title:
FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings
Author(s):
Terrail, Jean Ogier Du [Author]
Owkin France
Ayed, Samy-Safwan [Author]
Université Côte d'Azur [UniCA]
Cyffers, Edwige [Author]
Machine Learning in Information Networks [MAGNET]
Grimberg, Felix [Author]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
He, Chaoyang [Author]
FedML, Inc [FedML]
Loeb, Regis [Author]
Owkin France
Mangold, Paul [Author]
Machine Learning in Information Networks [MAGNET]
Marchand, Tanguy [Author]
Owkin France
Marfoq, Othmane [Author]
Institut Agronomique Néo-Calédonien [IAC]
Mushtaq, Erum [Author]
University of Southern California [USC]
Muzellec, Boris [Author]
Owkin France
Philippenko, Constantin [Author]
Centre de Mathématiques Appliquées de l'Ecole polytechnique [CMAP]
Silva, Santiago [Author]
Université Côte d'Azur [UniCA]
Teleńczuk, Maria [Author]
Owkin France
Albarqouni, Shadi [Author]
Helmholtz Munich, Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)
University Hospital Bonn
Avestimehr, Salman [Author]
FedML, Inc [FedML]
University of Southern California [USC]
Bellet, Aurelien [Author]
Machine Learning in Information Networks [MAGNET]
Dieuleveut, Aymeric [Author]
Centre de Mathématiques Appliquées de l'Ecole polytechnique [CMAP]
Jaggi, Martin [Author]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Karimireddy, Sai Praneeth [Author]
University of California [Berkeley] [UC Berkeley]
Lorenzi, Marco [Author]
E-Patient : Images, données & mOdèles pour la médeciNe numériquE [EPIONE]
Neglia, Giovanni [Author]
Network Engineering and Operations [NEO]
Tommasi, Marc [Author]
Machine Learning in Information Networks [MAGNET]
Andreux, Mathieu [Author]
Owkin France
Conference title:
NeurIPS 2022 - Thirty-sixth Conference on Neural Information Processing Systems
City:
New Orleans
Country:
United States of America
Conference start date:
2022-11-28
Journal title:
Proceedings of NeurIPS
HAL discipline(s):
Computer Science [cs]/Machine Learning [cs.LG]
Abstract (English): [en]
Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few (2-50) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works have proposed representative datasets for cross-device FL, few realistic healthcare cross-silo FL datasets exist, thereby slowing algorithmic research in this critical application. In this work, we propose a novel cross-silo dataset suite focused on healthcare, FLamby (Federated Learning AMple Benchmark of Your cross-silo strategies), to bridge the gap between theory and practice of cross-silo FL. FLamby encompasses 7 healthcare datasets with natural splits, covering multiple tasks, modalities, and data volumes, each accompanied with baseline training code. As an illustration, we additionally benchmark standard FL algorithms on all datasets. Our flexible and modular suite allows researchers to easily download datasets, reproduce results and re-use the different components for their research. FLamby is available at www.github.com/owkin/flamby.
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- 2210.04620
- Open access