Probing for Bridging Inference in Transformer Language Models
Document type:
Conference paper (with proceedings)
Title:
Probing for Bridging Inference in Transformer Language Models
Author(s):
Pandit, Onkar [Author]
Machine Learning in Information Networks [MAGNET]
Hou, Yufang [Author]
IBM [DUBLIN] [IBM]
Conference title:
NAACL 2021 - Annual Conference of the North American Chapter of the Association for Computational Linguistics
City:
Online Conference
Country:
Mexico
Conference start date:
2021-06-06
Publication date:
2021-06-06
HAL discipline(s):
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]
Abstract (English):
We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers focus prominently on bridging relations in comparison with the lower and middle layers; moreover, a few specific attention heads concentrate consistently on bridging. More importantly, we consider language models as a whole in our second approach, where bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). Our formulation produces optimistic results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor and antecedent and the context provided to the language model play an important role in the inference.
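To make the Of-Cloze formulation concrete, below is a minimal sketch (not the authors' code) of casting bridging anaphora resolution as masked token prediction with a frozen, off-the-shelf BERT via the Hugging Face transformers library; the context sentence, cloze phrasing, and candidate antecedents are invented for illustration, and this simple scoring assumes each candidate is a single word piece.

import torch
from transformers import BertForMaskedLM, BertTokenizer

# Frozen, pre-trained BERT -- no fine-tuning, matching the probing setup.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical example: the bridging anaphor "the door" is rewritten as an
# "X of [MASK]" cloze appended to its context; candidate antecedents are
# mentions drawn from the preceding text.
context = "I walked towards the house. The door was open."
cloze = "the door of the [MASK]"
candidates = ["house", "car", "garden"]

inputs = tokenizer(context + " " + cloze, return_tensors="pt")
mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos].squeeze(0)

# Score each candidate antecedent by the log-probability BERT assigns to it
# at the masked position; the highest-scoring candidate is the prediction.
log_probs = torch.log_softmax(logits, dim=-1)
scores = {c: log_probs[tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}
print(max(scores, key=scores.get))  # expected: "house"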
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- https://hal.inria.fr/hal-03284110/document
- Open access
- Access the document
- BridgingProbingBERT.pdf
- Open access
- Access the document