Document type:
Journal article: Original article
Title:
Simultaneous Gaussian Model-Based Clustering for Samples of Multiple Origins
Author(s):
Lourme, Alexandre [Author]
MOdel for Data Analysis and Learning [MODAL]
Biernacki, Christophe [Author]
MOdel for Data Analysis and Learning [MODAL]
Journal title:
Computational Statistics
Pages:
371-391
Publisher:
Springer Verlag
Publication date:
2013-12-19
ISSN:
0943-4062
English keyword(s):
Stochastic linear link
Gaussian mixture
Model-based clustering
EM algorithm
Model selection
Biological features
HAL discipline(s):
Statistics [stat]/Methodology [stat.ME]
Abstract in English: [en]
Gaussian mixture model-based clustering is now a standard tool to estimate some hypothetical underlying partition of a single dataset. In this paper, we aim to cluster several different datasets at the same time in a context where underlying populations, even though different, are not completely unrelated: All individuals are described by the same features and partitions of identical meaning are expected. Justifying from some natural arguments a stochastic linear link between the components of the mixtures associated to each dataset, we propose some parsimonious and meaningful models for a so-called simultaneous clustering method. Maximum likelihood mixture parameters, subject to the linear link constraint, can be easily estimated by a Generalized Expectation Maximization (GEM) algorithm that we describe. Some promising results are obtained in a biological context where simultaneous clustering outperforms independent clustering for partitioning three different subspecies of birds. Further results on ornithological data show that the proposed strategy is robust to the relaxation of the exact descriptor concordance which is one of its main assumptions.
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- document
- Open access
- classifsimul.pdf
- Open access