Document type:
Preprint or working paper
Permanent URL:
Title:
Sparse Factor Analysis for Categorical Data with the Group-Sparse Generalized Singular Value Decomposition
Author(s):
Yu, Ju-Chi [Corresponding author]
Centre for Addiction and Mental Health [Toronto] [CAMH]
Le Borgne, Julie [Author]
Facteurs de Risque et Déterminants Moléculaires des Maladies liées au Vieillissement - U 1167 [RID-AGE]
Krishnan, Anjali [Author]
City University of New York [New York] [CUNY]
Gloaguen, Arnaud [Author]
Centre National de Recherche en Génomique Humaine [CNRGH]
Yang, Cheng-Ta [Author]
Taipei Medical University
National Cheng Kung University [NCKU]
Rabin, Laura [Author]
City University of New York [New York] [CUNY]
Abdi, Hervé [Corresponding author]
University of Texas at Dallas [Richardson] [UT Dallas]
Guillemot, Vincent [Corresponding author]
Hub Bioinformatique et Biostatistique - Bioinformatics and Biostatistics HUB
English keyword(s):
Sparsification
Multivariate Analysis
Correspondence Analysis
Discriminant Correspondence Analysis
HAL discipline(s):
Statistics [stat]/Theory [stat.TH]
English abstract: [en]
Correspondence analysis, multiple correspondence analysis, and their discriminant counterparts (i.e., discriminant simple correspondence analysis and discriminant multiple correspondence analysis) are methods of choice for analyzing multivariate categorical data. In these methods, variables are integrated into optimal components computed as linear combinations whose weights are obtained from a generalized singular value decomposition (GSVD) that integrates specific metric constraints on the rows and columns of the original data matrix. The weights of the linear combinations are, in turn, used to interpret the components, and this interpretation is facilitated when 1) the components are pairwise orthogonal and 2) the values of the weights are either large or small but not intermediate, a pattern called a simple or a sparse structure. To obtain such simple configurations, the optimization problem solved by the GSVD is extended to include new constraints that implement component orthogonality and sparse weights. Because multiple correspondence analysis represents each qualitative variable by a set of binary variables, an additional group constraint is added to the optimization problem in order to sparsify the whole set representing one qualitative variable. This new algorithm, called group-sparse GSVD (gsGSVD), integrates these constraints via an iterative projection scheme onto the intersection of subspaces where each subspace implements a specific constraint. In this paper, we present this new algorithm, show how it can be adapted to the sparsification of simple and multiple correspondence analysis, and illustrate its applications with the analysis of four different data sets, each illustrating the sparsification of a particular CA-based analysis.
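As a point of reference for the abstract, the plain (non-sparse) GSVD that underlies simple correspondence analysis can be sketched as below. This is a minimal NumPy illustration of the standard textbook computation, not the gsGSVD algorithm of the paper: it includes no sparsity or group constraints. The function name `ca_gsvd` and the toy contingency table `counts` are assumptions introduced here for illustration only.

```python
import numpy as np

def ca_gsvd(counts):
    """Plain correspondence analysis via a GSVD (illustrative sketch only)."""
    X = np.asarray(counts, dtype=float)
    Z = X / X.sum()              # correspondence matrix (relative frequencies)
    r = Z.sum(axis=1)            # row masses
    c = Z.sum(axis=0)            # column masses
    # Standardized residuals: D_r^{-1/2} (Z - r c^T) D_c^{-1/2}
    S = (Z - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    # Generalized singular vectors, orthonormal under the metrics
    # D_r^{-1} (rows) and D_c^{-1} (columns)
    P = np.sqrt(r)[:, None] * U
    Q = np.sqrt(c)[:, None] * Vt.T
    # Row and column factor scores
    F = (P / r[:, None]) * d
    G = (Q / c[:, None]) * d
    return P, d, Q, F, G

# Hypothetical 3 x 3 contingency table (made-up counts)
counts = np.array([[10, 5, 3],
                   [4, 8, 6],
                   [2, 3, 9]])
P, d, Q, F, G = ca_gsvd(counts)
print(np.round(d ** 2, 4))       # inertia carried by each component
```

In this sketch the metric constraints mentioned in the abstract appear as the row and column masses: the weights in `P` and `Q` are orthonormal with respect to the inverse masses rather than the identity. A sparse variant would add further constraints on these weights, which this example does not attempt.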
Language:
English
Collections:
Source:
Deposit date:
2024-09-12T05:01:31Z
Files
- document
- Open access
- Sparse_MCA_2020__CSDA_%20%281%29.pdf