Delexicalized Word Embeddings for Cross-lingual Dependency Parsing
Document type :
Communication dans un congrès avec actes
DOI :
10.18653/v1/e17-1023
Title :
Delexicalized Word Embeddings for Cross-lingual Dependency Parsing
Author(s) :
Dehouck, Mathieu [Auteur]
Machine Learning in Information Networks [MAGNET]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Denis, Pascal [Auteur]
Machine Learning in Information Networks [MAGNET]
Conference title :
EACL
City :
Valencia
Country :
Spain
Start date of the conference :
2017-04-03
Book title :
EACL
Journal title :
EACL 2017
Publication date :
2017-04
English keyword(s) :
Word Embedding
Dependency Parsing
Cross-Lingual
Representation Learning
HAL domain(s) :
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]/Computation and Language [cs.CL]
Computer Science [cs]/Machine Learning [cs.LG]
English abstract : [en]
This paper presents a new approach to cross-lingual dependency parsing, aiming to leverage training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts, considering only the morpho-syntactic information associated with each word and its contexts. These delexicalized word embeddings, which can be trained on any set of languages and capture features shared across languages, are then used in combination with standard language-specific features to train a lexicalized parser in the target language. We evaluate our approach through experiments on a set of eight languages from the Universal Dependencies Project. Our main results show that using such delexicalized embeddings, whether trained in a monolingual or multilingual fashion, achieves significant improvements over monolingual baselines.
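The core idea in the abstract — embedding words by their morpho-syntactic dependency contexts rather than their surface forms — can be illustrated with a toy sketch. This is not the authors' implementation: the two-sentence "treebank", the count-plus-SVD embedding method, and all names below are invented for illustration only. Because tokens are reduced to POS tags, the resulting vocabulary is shared across languages, which is what makes the embeddings transferable.

```python
import numpy as np

# Toy treebanks: each token is (POS tag, head index), head index -1 marks
# the root. Only morpho-syntactic information is kept (no word forms),
# so both "languages" share the same POS-tag vocabulary.
sentences = [
    [("DET", 1), ("NOUN", 2), ("VERB", -1)],   # e.g. "the cat sleeps"
    [("NOUN", 1), ("VERB", -1), ("ADV", 1)],   # e.g. "chat dort bien"
]

tags = sorted({tag for sent in sentences for tag, _ in sent})
idx = {t: i for i, t in enumerate(tags)}

# Dependency-based contexts: each dependent co-occurs with its head's tag.
counts = np.zeros((len(tags), len(tags)))
for sent in sentences:
    for tag, head in sent:
        if head >= 0:
            counts[idx[tag], idx[sent[head][0]]] += 1

# Dense "delexicalized embeddings" via truncated SVD of the count matrix.
u, s, _ = np.linalg.svd(counts, full_matrices=False)
embeddings = u[:, :2] * s[:2]  # one 2-dimensional vector per POS tag
print({t: embeddings[i].round(2).tolist() for i, t in enumerate(tags)})
```

In the paper's setting the structural contexts come from full treebanks and the vectors are then concatenated with language-specific lexical features when training the target-language parser; the sketch only shows why training on POS-level contexts makes the representation language-independent.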
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
Source :
Files :
- https://hal.inria.fr/hal-01590639/document (open access)
- https://doi.org/10.18653/v1/e17-1023 (open access)