Document type :
Other scientific communication (conference without proceedings - poster - seminar...): Conference paper with proceedings
Title :
Vers une analyse des différences interlinguistiques entre les genres textuels : étude de cas basée sur les n-grammes et l'analyse factorielle des correspondances
Author(s) :
Lefer, Marie-Aude [Author]
Université Catholique de Louvain = Catholic University of Louvain [UCL]
Bestgen, Yves [Author]
Université Catholique de Louvain = Catholic University of Louvain [UCL]
Grabar, Natalia [Author]
Savoirs, Textes, Langage (STL) - UMR 8163 [STL]
Conference title :
TALN 2016: Traitement Automatique des Langues Naturelles
City :
Paris
Country :
France
Start date of the conference :
2016-07-04
English keyword(s) :
Comparable Corpora
Correspondence Analysis
N-grams
Genres
HAL domain(s) :
Computer Science [cs]/Document and Text Processing
Humanities and Social Sciences/Linguistics
English abstract : [en]
The aim of the present study is to assess the use of n-grams and Correspondence Analysis (CA) to compare genres in cross-linguistic studies. The study is based on an English-French bilingual corpus made up of original (i.e. non-translated) texts, representing three genres: European parliamentary debates, newspaper editorials and academic articles. First, 2- to 4-grams are extracted in each language. Second, the most frequent 1000 n-grams for each n-gram length and in each language are analyzed by means of CA with a view to determining which n-grams are particularly salient in the genres examined. Finally, n-grams are manually classified into a range of categories, such as stance expressions, discourse markers and referential expressions. The results show that the n-gram approach makes it possible to uncover typical features of the three genres investigated, as well as interesting contrasts between English and French.
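The first two steps described in the abstract (n-gram extraction, then a genre-by-n-gram frequency table restricted to the most frequent items) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `corpus` input format (genre name mapped to a token list) and the whitespace-level tokenization are all hypothetical. The last function does not perform a full CA; it computes only the Pearson residuals of the table, which is the matrix that CA decomposes by SVD, up to a constant scaling factor.

```python
from collections import Counter
from math import sqrt


def extract_ngrams(tokens, n_min=2, n_max=4):
    """Yield every n-gram (as a tuple) for n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def contingency_table(corpus, n, top_k=1000):
    """Count n-grams of length n per genre, keeping the top_k most frequent.

    corpus: dict mapping a genre name to a list of tokens
            (hypothetical input format, assumed for this sketch).
    Returns (ngrams, genres, table) where table[i][j] is the count of
    ngrams[i] in genres[j].
    """
    per_genre = {
        genre: Counter(extract_ngrams(tokens, n, n))
        for genre, tokens in corpus.items()
    }
    totals = Counter()
    for counts in per_genre.values():
        totals.update(counts)
    ngrams = [gram for gram, _ in totals.most_common(top_k)]
    genres = list(corpus)
    table = [[per_genre[g][gram] for g in genres] for gram in ngrams]
    return ngrams, genres, table


def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for each cell.

    Expected counts assume independence of rows (n-grams) and columns
    (genres). Large positive values flag n-grams over-represented in a genre.
    """
    grand = sum(map(sum, table))
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    residuals = []
    for row, r in zip(table, row_sums):
        out = []
        for obs, c in zip(row, col_sums):
            expected = r * c / grand
            # Guard against empty columns (a genre with no counted n-gram).
            out.append((obs - expected) / sqrt(expected) if expected else 0.0)
        residuals.append(out)
    return residuals
```

Calling `contingency_table(corpus, n=2)` on one language's texts and passing the resulting table to `pearson_residuals` gives a first indication of which bigrams lean toward which genre. A complete CA would additionally require a singular value decomposition, for instance with a numerical library.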
Language :
French
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Files
- https://hal.archives-ouvertes.fr/hal-01426820/document
- Open access
- lefer-TALN2016short.pdf