Document type:
Other scientific communication (conference without proceedings, poster, seminar...): Conference paper with proceedings
Title:
Multi-view Clustering of Heterogeneous Health Data: Application to Systemic Sclerosis
Author(s):
José-García, Adán [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Operational Research, Knowledge And Data [ORKAD]
Jacques, Julie [Author]
Université catholique de Lille [UCL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Filiot, Alexandre [Author]
Institute for Translational Research in Inflammation - U 1286 [INFINITE]
Handl, Julia [Author]
University of Manchester [Manchester]
Launay, David [Author]
Institute for Translational Research in Inflammation - U 1286 [INFINITE]
Sobanski, Vincent [Author]
Institute for Translational Research in Inflammation - U 1286 [INFINITE]
Institut universitaire de France [IUF]
Dhaenens, Clarisse [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Operational Research, Knowledge And Data [ORKAD]
Conference title:
Parallel Problem Solving from Nature – PPSN XVII
City:
Dortmund
Country:
Germany
Conference start date:
2022-09-10
Publisher:
Springer International Publishing
Publication date:
2022-08-15
HAL discipline(s):
Computer Science [cs]
Computer Science [cs]/Machine Learning [cs.LG]
Computer Science [cs]/Bioinformatics [q-bio.QM]
Statistics [stat]/Machine Learning [stat.ML]
Abstract in English: [en]
Electronic health records (EHRs) involve heterogeneous data types such as binary, numeric and categorical attributes. As traditional clustering approaches require the definition of a single proximity measure, different data types are typically transformed into a common format or amalgamated through a single distance function. Unfortunately, this early transformation step largely pre-determines the cluster analysis results and can cause information loss, as the relative importance of different attributes is not considered. This exploratory work aims to avoid this premature integration of attribute types prior to cluster analysis through a multi-objective evolutionary algorithm called MVMC. This approach makes it possible to integrate multiple data types into the clustering process, explore trade-offs between them, and determine consensus clusters that are supported across these data views. We evaluate our approach in a case study focusing on systemic sclerosis (SSc), a highly heterogeneous autoimmune disease that can be considered a representative example of an EHR data problem. Our results highlight the potential benefits of multi-view learning in an EHR context. Furthermore, this comprehensive classification, which integrates multiple and varied data sources, will help to better understand disease complications and treatment goals.
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Collections:
Source:
Files
- 2022_PPSN_paper_1_.pdf
- Open access