Document type:
Conference paper with published proceedings
Title:
A Probabilistic Model for Joint Learning of Word Embeddings from Texts and Images
Author(s):
Ailem, Melissa [Author]
USC Viterbi School of Engineering
Machine Learning in Information Networks [MAGNET]
Zhang, Bowen [Author]
USC Viterbi School of Engineering
Bellet, Aurelien [Author]
Machine Learning in Information Networks [MAGNET]
Denis, Pascal [Author]
Machine Learning in Information Networks [MAGNET]
Sha, Fei [Author]
USC Viterbi School of Engineering
Conference:
Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)
City:
Brussels
Country:
Belgium
Conference start date:
2018
HAL discipline(s):
Computer Science [cs]/Machine Learning [cs.LG]
Statistics [stat]/Machine Learning [stat.ML]
English abstract:
Several recent studies have shown the benefits of combining language and perception to infer word embeddings. These multimodal approaches either simply combine pre-trained textual and visual representations (e.g., features extracted from convolutional neural networks), or use the latter to bias the learning of textual word embeddings. In this work, we propose a novel probabilistic model to formalize how linguistic and perceptual inputs can work in concert to explain the observed word-context pairs in a text corpus. Our approach learns textual and visual representations jointly: latent visual factors couple together a skip-gram model for co-occurrence in linguistic data and a generative latent variable model for visual data. Extensive experimental studies validate the proposed model. Concretely, on the tasks of assessing pairwise word similarity and image/caption retrieval, our approach attains results that are competitive with or stronger than those of other state-of-the-art multimodal models.
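To make the coupling described in the abstract concrete, the sketch below illustrates one way per-word latent visual factors could tie a skip-gram negative-sampling objective to a generative model of CNN image features. It is a minimal PyTorch sketch of the general idea, not the authors' model: the class and parameter names (JointTextImageModel, couple, latent_dim, ...), the linear decoder, and the Gaussian (MSE) reconstruction term are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class JointTextImageModel(nn.Module):
    # Hypothetical sketch: NOT the published model. A shared per-word latent
    # visual factor enters both the textual embedding and an image-feature
    # generator, so text and vision are learned jointly.
    def __init__(self, vocab_size, embed_dim=100, visual_dim=4096, latent_dim=100):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, embed_dim)   # target-word vectors
        self.ctx_emb = nn.Embedding(vocab_size, embed_dim)    # context-word vectors
        # Per-word latent visual factors, plus a linear "decoder" standing in
        # for the generative latent variable model over CNN image features.
        self.visual_factor = nn.Embedding(vocab_size, latent_dim)
        self.decoder = nn.Linear(latent_dim, visual_dim)
        # Coupling: the latent visual factors also feed the textual embeddings.
        self.couple = nn.Linear(latent_dim, embed_dim, bias=False)

    def word_vec(self, words):
        # Word representation shaped by both linguistic and visual evidence.
        return self.word_emb(words) + self.couple(self.visual_factor(words))

    def skipgram_loss(self, words, contexts, negatives):
        # Skip-gram with negative sampling over observed word-context pairs.
        v = self.word_vec(words)                               # (B, E)
        pos = (self.ctx_emb(contexts) * v).sum(-1)             # (B,)
        neg = torch.bmm(self.ctx_emb(negatives), v.unsqueeze(2)).squeeze(2)  # (B, K)
        return -(F.logsigmoid(pos).mean() + F.logsigmoid(-neg).mean())

    def image_loss(self, words, image_feats):
        # Gaussian log-likelihood (up to constants) of CNN features given the
        # word's latent visual factor.
        return F.mse_loss(self.decoder(self.visual_factor(words)), image_feats)

    def forward(self, words, contexts, negatives, img_words, img_feats):
        # Joint objective: both terms share the latent visual factors.
        return (self.skipgram_loss(words, contexts, negatives)
                + self.image_loss(img_words, img_feats))

Under this reading, minimizing the joint loss pulls a word's embedding toward contexts it co-occurs with while its visual factor must also explain the image features of pictures depicting it, which is the "work in concert" behavior the abstract describes.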
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Files:
- https://hal.inria.fr/hal-01922985/document (open access)
- emnlp18.pdf (open access)