Document type:
Report and critical book review
Title:
Model-Based Clustering of Multivariate Ordinal Data Relying on a Stochastic Binary Search Algorithm
Author(s):
Biernacki, Christophe [Author]
MOdel for Data Analysis and Learning [MODAL]
Jacques, Julien [Author]
Entrepôts, Représentation et Ingénierie des Connaissances [ERIC]
MOdel for Data Analysis and Learning [MODAL]
Journal title:
Statistics and Computing
Pages:
929-943
Publisher:
Springer Verlag (Germany)
Publication date:
2016
ISSN:
0960-3174
HAL discipline(s):
Mathematics [math]/Statistics [math.ST]
Statistics [stat]/Theory [stat.TH]
English abstract: [en]
We design the first univariate probability distribution for ordinal data which strictly respects the ordinal nature of data. More precisely, it relies only on order comparisons between modalities. Contrariwise, most competitors either forget the order information or add a nonexistent distance information. The proposed distribution is obtained by modeling the data generating process which is assumed, from optimality arguments, to be a stochastic binary search algorithm in a sorted table. The resulting distribution is natively governed by two meaningful parameters (position and precision) and has very appealing properties: decrease around the mode, shape tuning from uniformity to a Dirac, identifiability. Moreover, it is easily estimated by an EM algorithm since the path in the stochastic binary search algorithm is missing. Using then the classical latent class assumption, the previous univariate ordinal model is straightforwardly extended to model-based clustering for multivariate ordinal data. Again, parameters of this mixture model are estimated by an EM algorithm. Both simulated and real data sets illustrate the great potential of this model by its ability to parsimoniously identify particularly relevant clusters which were unsuspected by some traditional competitors.
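Illustrative sketch (not part of the record): the abstract describes the data generating process as a stochastic binary search in a sorted table, governed by a position and a precision parameter. The Python below is one plausible reading of that idea, not the authors' specification. The function name sample_ordinal and the parameter names m (number of ordered categories), mu (position) and pi (precision) are assumptions introduced here, as are the uniform draw of the breakpoint and the size-proportional blind choice.

```python
import random


def sample_ordinal(m, mu, pi, rng=random):
    """Draw one value in 1..m by a noisy binary search aimed at mu.

    Assumed mechanics (hypothetical, for illustration only):
      - a breakpoint y is drawn uniformly in the current interval;
      - the interval is split into {values < y}, {y}, {values > y};
      - with probability pi the comparison is accurate and the part
        closest to mu is kept; otherwise a part is kept at random,
        with probability proportional to its size.
    """
    if not (1 <= mu <= m):
        raise ValueError("mu must lie in 1..m")
    if not (0.0 <= pi <= 1.0):
        raise ValueError("pi must lie in [0, 1]")

    lo, hi = 1, m  # current interval, bounds inclusive
    while lo < hi:
        y = rng.randint(lo, hi)  # uniform breakpoint
        parts = [(lo, y - 1), (y, y), (y + 1, hi)]
        parts = [(a, b) for a, b in parts if a <= b]  # drop empty parts

        if rng.random() < pi:
            # Accurate comparison: distance from mu to a part is 0 when
            # mu is inside it, else the gap to its nearest bound.
            lo, hi = min(parts, key=lambda p: max(p[0] - mu, mu - p[1], 0))
        else:
            # Blind comparison: only order-free size information is used.
            weights = [b - a + 1 for a, b in parts]
            lo, hi = rng.choices(parts, weights=weights, k=1)[0]
    return lo
```

Under these assumptions, pi = 1 always returns mu (the Dirac end of the shape range mentioned in the abstract), while lower values of pi spread the draws over the other categories. Only order comparisons and interval sizes are used; no numeric distance between categories enters the sampling beyond locating the nearest part.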
Language:
English
Popular science:
No
Collections:
Source:
Files
- document
- Open access
- Access the document
- Biernacki-Jacques-OrdinalDataModel.pdf