Model-based clustering with mixed/missing data using the new software MixtComp
Classification à base de modèles pour données mixtes et manquantes avec le nouveau logiciel MixtComp
Document type:
Other scientific communication (conference without proceedings, poster, seminar...): Conference presentation without proceedings
Permanent URL:
Title:
Model-based clustering with mixed/missing data using the new software MixtComp
Classification à base de modèles pour données mixtes et manquantes avec le nouveau logiciel MixtComp
Author(s):
Title of the scientific event:
CMStatistics 2015 (ERCIM 2015)
City:
London
Country:
United Kingdom
Start date of the scientific event:
2015-12-12
Publication date:
2015
HAL discipline(s):
Statistics [stat]/Methodology [stat.ME]
Abstract (English):
The "Big Data" paradigm involves large and complex data sets, for which clustering plays a central role in data exploration. For this purpose, model-based clustering has demonstrated many theoretical and practical successes in a wide variety of fields. MixtComp is new software, written in C++, that implements model-based clustering for multivariate missing/binned/mixed data under the conditional independence assumption. The data types currently implemented are continuous (Gaussian), categorical (multinomial), integer (Poisson) and ordinal (specific model). The architecture of MixtComp is nevertheless designed for the incremental addition of new kinds of data (ranks, functional, directional...) and their related models. Model estimation is performed by a Stochastic EM (SEM) algorithm, and several classical model selection criteria are available (BIC, ICL). MixtComp is not currently distributed as an R package, but it is freely available through a dedicated, user-friendly web interface (https://modal-research.lille.inria.fr/BigStat/), and its output is an R object that can be used directly in the R environment. Beyond clustering, it also makes it possible to impute missing/binned data (with associated confidence intervals) by exploiting the mixture model's ability to estimate densities.
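Illustrative sketch (not part of the record): the abstract names three ingredients, namely a mixture model for mixed data under conditional independence, estimation by a Stochastic EM algorithm, and model selection by BIC. The Python code below is a minimal sketch of how these fit together for one continuous (Gaussian) and one integer (Poisson) variable. It is not MixtComp's code or API: MixtComp is written in C++, and the function names (class_log_joint, m_step, sem), the default settings and the BIC convention (-2 log-likelihood plus penalty, lower is better) are assumptions made for this example. Missing or binned values, categorical and ordinal variables, and ICL are left out.

```python
import numpy as np
from math import lgamma


def class_log_joint(x, k, pi, mu, sd, lam):
    """(n, K) matrix of log(pi_g) + log N(x | mu_g, sd_g) + log Poisson(k | lam_g)."""
    log_fact = np.array([lgamma(v + 1.0) for v in k])  # log(k!)
    cols = []
    for g in range(len(pi)):
        log_gauss = -0.5 * np.log(2 * np.pi * sd[g] ** 2) - (x - mu[g]) ** 2 / (2 * sd[g] ** 2)
        log_pois = k * np.log(lam[g]) - lam[g] - log_fact
        # Conditional independence: within a class the log-densities add up.
        cols.append(np.log(pi[g]) + log_gauss + log_pois)
    return np.column_stack(cols)


def m_step(x, k, z, K):
    """Maximum-likelihood estimates on the data completed by the labels z."""
    pi = np.array([np.mean(z == g) for g in range(K)])
    mu = np.array([x[z == g].mean() for g in range(K)])
    sd = np.array([max(x[z == g].std(), 1e-3) for g in range(K)])
    lam = np.array([max(k[z == g].mean(), 1e-3) for g in range(K)])
    return pi, mu, sd, lam


def sem(x, k, K=2, n_iter=200, burn_in=100, seed=0):
    """Hypothetical SEM loop: alternate an M-step and a stochastic S-step."""
    rng = np.random.default_rng(seed)
    n = len(x)
    z = rng.integers(K, size=n)  # random initial partition
    draws = []
    for it in range(n_iter):
        # Guard (assumes n is much larger than K): redraw if a class is nearly empty.
        while np.bincount(z, minlength=K).min() < 2:
            z = rng.integers(K, size=n)
        params = m_step(x, k, z, K)
        if it >= burn_in:
            draws.append(np.concatenate(params))
        # S-step: sample each label from its posterior class probabilities.
        lj = class_log_joint(x, k, *params)
        post = np.exp(lj - lj.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        u = rng.random(n)[:, None]
        z = np.minimum((u > np.cumsum(post, axis=1)).sum(axis=1), K - 1)
    # Average the post-burn-in parameter draws (this ignores label switching).
    pi, mu, sd, lam = np.split(np.mean(draws, axis=0), 4)
    lj = class_log_joint(x, k, pi, mu, sd, lam)
    loglik = np.logaddexp.reduce(lj, axis=1).sum()
    n_free = (K - 1) + 3 * K  # proportions + (mean, sd, rate) per class
    bic = -2.0 * loglik + n_free * np.log(n)
    return {"pi": pi, "mu": mu, "sd": sd, "lam": lam,
            "labels": lj.argmax(axis=1), "loglik": loglik, "bic": bic}
```

Calling sem(x, k, K=g) for several values of g and keeping the fit with the smallest "bic" entry mimics, in a very reduced form, the model selection step mentioned in the abstract. Averaging the draws after burn-in is one common way to turn the SEM chain into a point estimate, and it is only safe here because the sketch assumes the class labels do not switch during the run.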
Language:
English
Audience:
International
Popular science:
No
Deposit date:
2020-06-08T14:11:11Z