Document type :
Journal article
Permalink :
Title :
Simultaneous Gaussian Model-Based Clustering for Samples of Multiple Origins
Author(s) :
Journal title :
Computational Statistics
Volume number :
152
Pages :
371-391
Publisher :
Springer Verlag
Publication date :
2013-12-19
ISSN :
0943-4062
HAL domain(s) :
Statistics [stat]/Methodology [stat.ME]
English abstract : [en]
Gaussian mixture model-based clustering is now a standard tool to estimate some hypothetical underlying partition of a single dataset. In this paper, we aim to cluster several different datasets at the same time in a context where underlying populations, even though different, are not completely unrelated: All individuals are described by the same features and partitions of identical meaning are expected. Justifying from some natural arguments a stochastic linear link between the components of the mixtures associated to each dataset, we propose some parsimonious and meaningful models for a so-called simultaneous clustering method. Maximum likelihood mixture parameters, subject to the linear link constraint, can be easily estimated by a Generalized Expectation Maximization (GEM) algorithm that we describe. Some promising results are obtained in a biological context where simultaneous clustering outperforms independent clustering for partitioning three different subspecies of birds. Further results on ornithological data show that the proposed strategy is robust to the relaxation of the exact descriptor concordance which is one of its main assumptions.
Language :
English
Audience :
International
Popular science :
No
Submission date :
2020-06-08T14:11:37Z
2020-06-09T09:12:11Z
2020-06-09T09:12:11Z
Files
- document
- Open access