Document type:
Preprint or working paper
Title:
Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs
Author(s):
Han, Chao [Author]
University Hospital LMU Munich
Basu, Debabrota [Author]
Centrale Lille
Université de Lille
Inria Lille - Nord Europe
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Scool [Scool]
Mangan, Michael [Author]
University of Sheffield [Sheffield]
Vasilaki, Eleni [Author]
University of Sheffield [Sheffield]
Gilra, Aditya [Author]
Centrum Wiskunde & Informatica [CWI]
Publication date:
2024-11-12
Keyword(s) in English:
Reinforcement Learning (RL)
Partially observable Markov decision process (POMDP)
Factored Partially Observable Markov Decision Process (FPOMDP)
Causal Inference
Variational autoencoder
Causal structure learning
Dynamical system
HAL discipline(s):
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]/Machine Learning [cs.LG]
Computer Science [cs]/Systems and Control [cs.SY]
Mathematics [math]/Dynamical Systems [math.DS]
Abstract (English):
Learning representations of underlying environmental dynamics from partial observations is a critical challenge in machine learning. In the context of Partially Observable Markov Decision Processes (POMDPs), state representations are often inferred from the history of past observations and actions. We demonstrate that incorporating future information is essential to accurately capture causal dynamics and enhance state representations. To address this, we introduce a Dynamical Variational Auto-Encoder (DVAE) designed to learn causal Markovian dynamics from offline trajectories in a POMDP. Our method employs an extended hindsight framework that integrates past, current, and multi-step future information within a factored-POMDP setting. Empirical results reveal that this approach uncovers the causal graph governing hidden state transitions more effectively than history-based and typical hindsight-based models.
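The abstract contrasts history-based inference with an extended hindsight framework that also uses multi-step future information. As a rough illustration only, and not the authors' implementation, the sketch below shows how the input to such an inference step could be assembled from an offline trajectory. The function name `hindsight_context`, the parameters `n_past` and `n_future`, and the toy (observation, action) pairs are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy step type: (observation, action).
Step = Tuple[float, int]


def hindsight_context(
    trajectory: Sequence[Step], t: int, n_past: int, n_future: int
) -> Dict[str, List[Step]]:
    """Split an offline trajectory around time t into past, current and future parts."""
    if not 0 <= t < len(trajectory):
        raise IndexError("t must index a step of the trajectory")
    # Up to n_past steps strictly before t (truncated at the trajectory start).
    past = list(trajectory[max(0, t - n_past):t])
    current = [trajectory[t]]
    # Up to n_future steps strictly after t (truncated at the trajectory end).
    future = list(trajectory[t + 1:t + 1 + n_future])
    return {"past": past, "current": current, "future": future}


# Toy offline trajectory of (observation, action) pairs (made-up values).
trajectory = [(0.1, 0), (0.4, 1), (0.3, 1), (0.9, 0), (0.7, 1), (0.2, 0)]

# History-based input: no future steps are included.
history_only = hindsight_context(trajectory, t=2, n_past=2, n_future=0)

# Extended hindsight input: past, current, and three future steps.
extended = hindsight_context(trajectory, t=2, n_past=2, n_future=3)
```

With `n_future=0` the context reduces to the history-based case; a larger `n_future` adds the multi-step future window that an encoder could condition on when inferring the hidden state at time `t`.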
Language:
English
Files
- 2411.07832
- Open access