ordinalClust: An R Package to Analyze Ordinal Data
Document type:
Report and critical review of a work
DOI:
Title:
ordinalClust: An R Package to Analyze Ordinal Data
Author(s):
Selosse, Margot [Author]
Entrepôts, Représentation et Ingénierie des Connaissances [ERIC]
Jacques, Julien [Author]
Entrepôts, Représentation et Ingénierie des Connaissances [ERIC]
Biernacki, Christophe [Author]
MOdel for Data Analysis and Learning [MODAL]
Journal title:
The R Journal
Publisher:
R Foundation for Statistical Computing
Publication date:
2021-01-14
HAL discipline(s):
Mathematics [math]/Statistics [math.ST]
English abstract: [en]
Ordinal data are used in many domains, especially when measurements are collected from people through observations, tests, or questionnaires. ordinalClust is an innovative R package dedicated to ordinal data that provides tools for modeling, clustering, co-clustering, and classifying such data. Ordinal data are modeled using the BOS distribution, a model with two meaningful parameters referred to as "position" and "precision". The former indicates the mode of the distribution and the latter describes how scattered the data are around the mode, so users can easily interpret the distribution of their data from these two parameters. The package is based on the co-clustering framework, in which rows and columns are clustered simultaneously. The co-clustering approach uses the Latent Block Model (LBM) and the SEM-Gibbs algorithm for parameter inference. The clustering and classification methods, in turn, rely on simplified versions of the SEM-Gibbs algorithm. For classification, two approaches are proposed. In the first, the BOS parameters are estimated from the training dataset in the conventional way. In the second, parsimony is introduced by estimating the parameters and column-clusters from the training dataset. We empirically show that this second approach can yield better results. For clustering and co-clustering, the ICL-BIC criterion is used for model selection. An overview of these methods is given, and their use with the ordinalClust package is illustrated on real datasets. The latest stable package version is available on the Comprehensive R Archive Network (CRAN).
Language:
English
Popular science:
No
Collections:
Source:
Files
- document
- Open access
- Access the document
- RJwrapper.pdf
- Open access
- Access the document
- rj-2021-011
- Open access
- Access the document