

Document type :
Journal article
Title :
Uncovering Machine Translationese Using Corpus Analysis Techniques to Distinguish between Original and Machine-Translated French
Author(s) :
de Clercq, Orphée [Author]
de Sutter, Gert [Author]
Loock, Rudy [Author]
Savoirs, Textes, Langage (STL) - UMR 8163 [STL]
Cappelle, Bert [Author]
Plevoets, Koen [Author]
Journal title :
Translation Quarterly
Pages :
21-45
Publisher :
The Hong Kong Translation Society
Publication date :
2021
ISSN :
1027-8559
English keyword(s) :
Machine translation
Corpus-based translation studies
Translation
Translation quality
Corpus linguistics
Machine-translationese
HAL domain(s) :
Humanities and Social Sciences/Linguistics
English abstract : [en]
This paper investigates the linguistic characteristics of English to French machine-translated texts in comparison with French original, untranslated texts in order to uncover what has been called “machine translationese”. In the same vein as corpus-based translation studies which have focused on human-translated texts, and using a corpus-based statistical approach (Principal Component Analysis), we analyzed a ca. 1.8-million-word corpus of English to French translations of press texts, corresponding to the output of four machine translation systems: one statistical (SMT) and three neural (NMT) systems, namely DeepL, Google Translate, and the European Commission’s eTranslation MT tool, in both its SMT and NMT versions. In particular, to complement a previous study on language-specific features in French (e.g. derived adverbs, existential constructions, coordinator et, preposition avec), a series of language-independent linguistic features were extracted for each text in our corpus, ranging from superficial text characteristics such as average word and sentence length to frequencies of closed-class lexical categories and measures of lexical diversity. Our results, which compare the machine-translated data with a corpus of French untranslated data, allow us to uncover linguistic features in French machine-translated texts that clearly deviate from the observed norms in original French (e.g. average sentence length, n-gram features, lexical diversity), and which might serve as information for the post-editing process in order to optimize translation quality.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
  • Savoirs, Textes, Langage (STL) - UMR 8163
Source :
Harvested from HAL
Université de Lille
