Facing the facts of fake: a distributional semantics and corpus annotation approach

Cappelle, Bert; Denis, Pascal; Keller, Mikaela

Document type:

Book report and critical review

Title:

Facing the facts of fake: a distributional semantics and corpus annotation approach

Author(s):

Cappelle, Bert [Author]
Savoirs, Textes, Langage (STL) - UMR 8163 [STL]

Denis, Pascal [Author]
Machine Learning in Information Networks [MAGNET]

Keller, Mikaela [Author]
Machine Learning in Information Networks [MAGNET]

Journal title:

Yearbook of the German Cognitive Linguistics Association

Publisher:

De Gruyter

Publication date:

2018-11-28

ISSN:

2197-2796

Keyword(s) in English:

distributional semantics
bigram
modification
fake
privative adjective
context-sensitivity
word and phrase embedding

HAL discipline(s):

Humanities and Social Sciences/Information and Communication Sciences

Abstract in English: [en]

Fake is often considered the textbook example of a so-called 'privative' adjective, one which, in other words, allows the proposition that '(a) fake x is not (an) x'. This study tests the hypothesis that the contexts of an adjective-noun combination are more different from the contexts of the noun when the adjective is such a 'privative' one than when it is an ordinary (subsective) one. We here use 'embeddings', that is, dense vector representations based on word co-occurrences in a large corpus, which in our study is the entire English Wikipedia as it was in 2013. Comparing the cosine distance between the adjective-noun bigram and single noun embeddings across two sets of adjectives, privative and ordinary ones, we fail to find a noticeable difference. However, we contest that fake is an across-the-board privative adjective, since a fake article, for instance, is most definitely still an article. We extend a recent proposal involving the noun's qualia roles (how an entity is made, what it consists of, what it is used for, etc.) and propose several interpretational types of fake-noun combinations, some but not all of which are privative. These interpretations, which we assign manually to the 100 most frequent fake-noun combinations in the Wikipedia corpus, depend to a large extent on the meaning of the noun, as combinations with similar interpretations tend to involve nouns that are linked in a distributions-based network. When we restrict our focus to the privative uses of fake only, we do detect a slightly enlarged difference between fake + noun bigram and noun distributions compared to the previously obtained average difference between adjective + noun bigram and noun distributions. This result contrasts with negative or even opposite findings reported in the literature.
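
Illustrative note: the comparison described in the abstract (cosine distance between an adjective-noun bigram embedding and the embedding of the bare noun, averaged over a set of adjectives) might be sketched in Python as follows. This is a minimal sketch and not the authors' code: the function names, the underscore-joined bigram keys and the three-dimensional toy vectors are assumptions made purely for illustration, and none of the numbers are data from the study.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_bigram_noun_distance(embeddings, pairs):
    """Average cosine distance between each bigram vector and its noun vector.

    `embeddings` maps a word or an 'adjective_noun' key to a vector;
    the underscore key convention is a hypothetical choice for this sketch.
    """
    distances = [
        cosine_distance(embeddings[f"{adj}_{noun}"], embeddings[noun])
        for adj, noun in pairs
    ]
    return sum(distances) / len(distances)


# Toy vectors invented for illustration only (not taken from the study).
toy_embeddings = {
    "gun": [1.0, 0.0, 0.0],
    "fake_gun": [0.0, 1.0, 0.0],  # orthogonal to "gun": maximal context shift
    "car": [0.0, 0.0, 2.0],
    "red_car": [0.0, 0.0, 5.0],   # same direction as "car": no context shift
}

privative_mean = mean_bigram_noun_distance(toy_embeddings, [("fake", "gun")])
ordinary_mean = mean_bigram_noun_distance(toy_embeddings, [("red", "car")])
```

Under the hypothesis tested in the study, the mean distance for a set of privative adjectives would be expected to exceed the mean for ordinary (subsective) ones; the toy vectors above are constructed so that this holds trivially.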

Language:

English

Popular science:

No

Collections: