Document type:
Other scientific communication (conference without proceedings - poster - seminar...): Conference paper with proceedings
Title:
Vers une analyse des différences interlinguistiques entre les genres textuels : étude de cas basée sur les n-grammes et l'analyse factorielle des correspondances
Author(s):
Lefer, Marie-Aude [Author]
Université Catholique de Louvain = Catholic University of Louvain [UCL]
Bestgen, Yves [Author]
Université Catholique de Louvain = Catholic University of Louvain [UCL]
Grabar, Natalia [Author]
Savoirs, Textes, Langage (STL) - UMR 8163 [STL]
Conference title:
TALN 2016: Traitement Automatique des Langues Naturelles
City:
Paris
Country:
France
Conference start date:
2016-07-04
Keyword(s) in English:
Comparable Corpora
Correspondence Analysis
N-grams
Genres
HAL discipline(s):
Computer Science [cs]/Document and Text Processing
Humanities and Social Sciences/Linguistics
Abstract in English: [en]
The aim of the present study is to assess the use of n-grams and Correspondence Analysis (CA) to compare genres in cross-linguistic studies. The study is based on an English-French bilingual corpus made up of original (i.e. non-translated) texts, representing three genres: European parliamentary debates, newspaper editorials and academic articles. First, 2- to 4-grams are extracted in each language. Second, the 1,000 most frequent n-grams for each n-gram length and in each language are analyzed by means of CA with a view to determining which n-grams are particularly salient in the genres examined. Finally, n-grams are manually classified into a range of categories, such as stance expressions, discourse markers and referential expressions. The results show that the n-gram approach makes it possible to uncover typical features of the three genres investigated, as well as interesting contrasts between English and French.
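The first two steps described in the abstract (extracting 2- to 4-grams, then keeping the most frequent ones per n-gram length before a Correspondence Analysis) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the toy corpus, the pre-tokenized lowercase input and the choice to rank n-grams by their pooled frequency across genres are all assumptions. The last function does not perform a full CA; it only builds the matrix of standardized residuals that CA would then decompose, along with the total inertia (the chi-square statistic divided by the grand total).

```python
from collections import Counter
from math import sqrt


def extract_ngrams(tokens, n_min=2, n_max=4):
    """Yield every contiguous n-gram (as a tuple) for n in [n_min, n_max]."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def top_ngrams_by_length(corpus, k=1000, n_min=2, n_max=4):
    """Return {n: the k most frequent n-grams of length n, pooled over genres}."""
    totals = Counter()
    for texts in corpus.values():
        for tokens in texts:
            totals.update(extract_ngrams(tokens, n_min, n_max))
    selected = {}
    for n in range(n_min, n_max + 1):
        of_length = Counter({g: c for g, c in totals.items() if len(g) == n})
        selected[n] = [g for g, _ in of_length.most_common(k)]
    return selected


def contingency_table(corpus, ngrams):
    """Count same-length n-grams per genre: rows are n-grams, columns are genres."""
    genres = sorted(corpus)
    n = len(ngrams[0])  # assumes a non-empty list of n-grams of equal length
    per_genre = {}
    for genre in genres:
        counts = Counter()
        for tokens in corpus[genre]:
            counts.update(extract_ngrams(tokens, n, n))
        per_genre[genre] = counts
    return genres, [[per_genre[g][ng] for g in genres] for ng in ngrams]


def standardized_residuals(table):
    """Return the residual matrix that CA decomposes, plus the total inertia.

    Assumes every row and every column of the table has a non-zero sum.
    """
    total = sum(map(sum, table))
    row_mass = [sum(row) / total for row in table]
    col_mass = [sum(col) / total for col in zip(*table)]
    residuals = [
        [(cell / total - r * c) / sqrt(r * c) for cell, c in zip(row, col_mass)]
        for row, r in zip(table, row_mass)
    ]
    inertia = sum(x * x for row in residuals for x in row)
    return residuals, inertia


# Hypothetical toy corpus: one tiny pre-tokenized text per genre.
toy_corpus = {
    "debates": [["i", "would", "like", "to", "thank", "the", "commission"]],
    "editorials": [["it", "is", "time", "to", "thank", "the", "voters"]],
    "articles": [["the", "results", "show", "that", "it", "is", "time"]],
}

selected = top_ngrams_by_length(toy_corpus, k=4)
genres, table = contingency_table(toy_corpus, selected[2])
residuals, inertia = standardized_residuals(table)
```

In a real setting, the residual matrix would be passed to a singular value decomposition to obtain the CA axes, and the whole procedure would be run separately for each language and each n-gram length, as the abstract indicates.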
Language:
French
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- https://hal.archives-ouvertes.fr/hal-01426820/document
- Open access
- Access the document
- lefer-TALN2016short.pdf
- Open access
- Access the document