Document type:
Article in a scientific journal
Permanent URL:
Title:
Simultaneous Gaussian Model-Based Clustering for Samples of Multiple Origins
Author(s):
Journal title:
Computational Statistics
Issue:
152
Pages:
371-391
Publisher:
Springer Verlag
Publication date:
2013-12-19
ISSN:
0943-4062
HAL discipline(s):
Statistics [stat]/Methodology [stat.ME]
English abstract: [en]
Gaussian mixture model-based clustering is now a standard tool to estimate some hypothetical underlying partition of a single dataset. In this paper, we aim to cluster several different datasets at the same time in a context where underlying populations, even though different, are not completely unrelated: all individuals are described by the same features and partitions of identical meaning are expected. Justifying from some natural arguments a stochastic linear link between the components of the mixtures associated with each dataset, we propose some parsimonious and meaningful models for a so-called simultaneous clustering method. Maximum likelihood mixture parameters, subject to the linear link constraint, can be easily estimated by a Generalized Expectation Maximization (GEM) algorithm that we describe. Some promising results are obtained in a biological context where simultaneous clustering outperforms independent clustering for partitioning three different subspecies of birds. Further results on ornithological data show that the proposed strategy is robust to the relaxation of the exact descriptor concordance, which is one of its main assumptions.
Language:
English
Audience:
International
Popular science:
No
Deposit date:
2020-06-08T14:11:37Z
2020-06-09T09:12:11Z
2020-06-09T09:12:11Z
Files
- documen
- Open access
- Access the document