Document type :
Conference paper with proceedings
Title :
FiLM: Visual Reasoning with a General Conditioning Layer
Author(s) :
Perez, Ethan [Author]
Université de Montréal [UdeM]
Rice University [Houston]
Strub, Florian [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Sequential Learning [SEQUEL]
de Vries, Harm [Author]
Université de Montréal [UdeM]
Dumoulin, Vincent [Author]
Université de Montréal [UdeM]
Courville, Aaron [Author]
Université de Montréal [UdeM]
Conference title :
AAAI Conference on Artificial Intelligence
City :
New Orleans
Country :
United States of America
Start date of the conference :
2018-02-02
HAL domain(s) :
Computer Science [cs]/Neural Networks [cs.NE]
Computer Science [cs]/Artificial Intelligence [cs.AI]
English abstract : [en]
We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.
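Illustrative sketch :
The feature-wise affine transformation described in the abstract can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the names `linear`, `film_parameters`, and `film` are hypothetical, and the use of a single linear map to turn the conditioning vector into per-feature scale (gamma) and shift (beta) values is an assumption, since the abstract does not say how the conditioning information is processed.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# One flat list of activations per feature (channel).
FeatureMaps = List[List[float]]


def linear(weights: Sequence[Sequence[float]],
           bias: Sequence[float],
           x: Sequence[float]) -> Vector:
    """Plain dense layer: returns weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film_parameters(cond: Sequence[float],
                    w_gamma: Sequence[Sequence[float]],
                    b_gamma: Sequence[float],
                    w_beta: Sequence[Sequence[float]],
                    b_beta: Sequence[float]) -> Tuple[Vector, Vector]:
    """Map conditioning information to one (gamma, beta) pair per feature.

    ASSUMPTION: a single linear map per parameter; any function of the
    conditioning input could be substituted here.
    """
    gamma = linear(w_gamma, b_gamma, cond)
    beta = linear(w_beta, b_beta, cond)
    return gamma, beta


def film(features: FeatureMaps,
         gamma: Sequence[float],
         beta: Sequence[float]) -> FeatureMaps:
    """Feature-wise affine transformation: gamma_c * F_c + beta_c.

    Every activation in feature c is scaled and shifted by the same pair
    (gamma_c, beta_c).
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need exactly one gamma and one beta per feature")
    return [[g * v + b for v in channel]
            for channel, g, b in zip(features, gamma, beta)]
```

In this sketch, a gamma of zero switches a feature off entirely (leaving only the shift), while a gamma of one and a beta of zero pass it through unchanged, which is one way to read "modulating" features on the basis of conditioning information.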
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Files
- http://arxiv.org/pdf/1707.03017 (arXiv:1707.03017, open access)