Multi-Partitions Subspace Clustering
Document type:
Article in a scientific journal: Original article
DOI:
Permanent URL:
Title:
Multi-Partitions Subspace Clustering
Author(s):
Vandewalle, Vincent [Author]
MOdel for Data Analysis and Learning [MODAL]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Journal title:
Mathematics
Journal short name:
Mathematics
Number:
8
Publication date:
2020-06-06
ISSN:
2227-7390
English keyword(s):
clustering
mixture model
factorial discriminant analysis
EM algorithm
HAL discipline(s):
Life Sciences [q-bio]
English abstract: [en]
In model-based clustering, it is often supposed that a single latent clustering variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering approach is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It makes it possible to find the multi-partitions and the related subspaces simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters per subspace. The results obtained are thus several projections of the data, each one conveying its own clustering of the data. The model's behavior is illustrated on simulated and real data.
Language:
English
Audience:
International
Popular science:
No
Institution(s):
Université de Lille
CHU Lille
Deposit date:
2023-11-15T09:42:31Z
2023-12-13T13:55:00Z
Files
- mathematics-08-00597-v2.pdf
- Not specified
- Open access