A tractable multi-partitions clustering
Document type :
Journal article: Original article
Permalink :
Title :
A tractable multi-partitions clustering
Author(s) :
Marbac, Matthieu [Author]
Centre de Recherche en Économie et Statistique [CREST]
Vandewalle, Vincent [Author]
Evaluation des technologies de santé et des pratiques médicales - ULR 2694 [METRICS]
Journal title :
Computational Statistics and Data Analysis
Abbreviated title :
Comput. Stat. Data Anal.
Volume number :
132
Pages :
167-179
Publisher :
Elsevier
Publication date :
2019-04-01
ISSN :
0167-9473
English keyword(s) :
Model choice
Mixture model
Model-based clustering
Mixed-data
Variables selection
HAL domain(s) :
Life Sciences [q-bio]
English abstract : [en]
In the framework of model-based clustering, a model allowing several latent class variables is proposed. This model assumes that the distribution of the observed data can be factorized into several independent blocks of variables. Each block is assumed to follow a latent class model (i.e., a mixture with a conditional independence assumption). The proposed model includes variable selection as a special case and is able to cope with the mixed-data setting. The simplicity of the model makes it possible to estimate the partition of the variables into blocks and the mixture parameters simultaneously, thus avoiding the need to run an EM algorithm for each possible partition of the variables into blocks. For the proposed method, a model is defined by the number of blocks, the number of clusters inside each block, and the partition of the variables into blocks. Model selection can be done with two information criteria, the BIC and the MICL, for which an efficient optimization is proposed. The performance of the model is investigated on simulated and real data. It is shown that the proposed method gives a rich interpretation of the data set at hand (i.e., analysis of the partition of the variables into blocks and analysis of the clusters produced by each block of variables).
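The factorization described in the abstract can be illustrated with a minimal sketch. It assumes all variables are continuous with Gaussian components, which is only one case of the mixed-data setting. The function names, argument layout, and parameter shapes are hypothetical and are not taken from the article or its software. Each block contributes the log-likelihood of its own latent class mixture, and independence between blocks means the contributions simply add up.

```python
import numpy as np

def log_gaussian(x, mean, sd):
    """Elementwise log-density of a univariate Gaussian N(mean, sd^2)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

def block_loglik(X, proportions, means, sds):
    """Log-likelihood of one block under a latent class Gaussian mixture.

    X           : (n, d) observations restricted to the variables of the block
    proportions : (K,) mixing proportions of the block's clusters
    means, sds  : (K, d) per-cluster, per-variable parameters (illustrative layout)
    """
    # Conditional independence: within a cluster, sum log-densities over variables.
    comp = log_gaussian(X[:, None, :], means[None, :, :], sds[None, :, :]).sum(axis=2)
    weighted = comp + np.log(proportions)[None, :]  # shape (n, K)
    # Numerically stable log-sum-exp over the clusters of the block.
    m = weighted.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(weighted - m).sum(axis=1))).sum())

def multipartition_loglik(X, blocks, params):
    """Total log-likelihood: independent blocks, so block terms are summed.

    blocks : list of lists of column indices, one list per block
    params : list of (proportions, means, sds) tuples, one per block
    """
    return sum(block_loglik(X[:, cols], *params[b]) for b, cols in enumerate(blocks))
```

In this sketch, a block with a single cluster behaves like a set of variables that carry no clustering information, which is one way to read the abstract's remark that variable selection arises as a special case.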
Language :
English
Audience :
International
Popular science :
No
Administrative institution(s) :
CHU Lille
Université de Lille
Submission date :
2019-12-09T18:20:45Z
2024-03-27T11:42:53Z
Annexes
- 1801.07063
- Open access
- Source of the main file