On Markov chain Monte Carlo methods for tall data
Type de document :
Compte-rendu et recension critique d'ouvrage
Titre :
On Markov chain Monte Carlo methods for tall data
Auteur(s) :
Bardenet, Remi [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Doucet, Arnaud [Auteur]
Department of Statistics [Oxford]
Holmes, Chris [Auteur]
Department of Statistics [Oxford]

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Doucet, Arnaud [Auteur]
Department of Statistics [Oxford]
Holmes, Chris [Auteur]
Department of Statistics [Oxford]
Titre de la revue :
Journal of Machine Learning Research
Éditeur :
Microtome Publishing
Date de publication :
2017-05
ISSN :
1532-4435
Discipline(s) HAL :
Mathématiques [math]/Statistiques [math.ST]
Résumé en anglais : [en]
</i> Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number n of individual ...
Lire la suite ></i> Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number n of individual data points, also known as tall datasets. In scenarios where data are assumed independent, various approaches to scale up the Metropolis-Hastings algorithm in a Bayesian inference context have been recently proposed in machine learning and computational statistics. These approaches can be grouped into two categories: divide-and-conquer approaches and, subsampling-based algorithms. The aims of this article are as follows. First, we present a comprehensive review of the existing literature, commenting on the underlying assumptions and theoretical guarantees of each method. Second, by leveraging our understanding of these limitations, we propose an original subsampling-based approach which samples from a distribution provably close to the posterior distribution of interest, yet can require less than O(n) data point likelihood evaluations at each iteration for certain statistical models in favourable scenarios. Finally, we have only been able so far to propose subsampling-based methods which display good performance in scenarios where the Bernstein-von Mises approximation of the target posterior distribution is excellent. It remains an open challenge to develop such methods in scenarios where the Bernstein-von Mises approximation is poor.Lire moins >
Lire la suite ></i> Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number n of individual data points, also known as tall datasets. In scenarios where data are assumed independent, various approaches to scale up the Metropolis-Hastings algorithm in a Bayesian inference context have been recently proposed in machine learning and computational statistics. These approaches can be grouped into two categories: divide-and-conquer approaches and, subsampling-based algorithms. The aims of this article are as follows. First, we present a comprehensive review of the existing literature, commenting on the underlying assumptions and theoretical guarantees of each method. Second, by leveraging our understanding of these limitations, we propose an original subsampling-based approach which samples from a distribution provably close to the posterior distribution of interest, yet can require less than O(n) data point likelihood evaluations at each iteration for certain statistical models in favourable scenarios. Finally, we have only been able so far to propose subsampling-based methods which display good performance in scenarios where the Bernstein-von Mises approximation of the target posterior distribution is excellent. It remains an open challenge to develop such methods in scenarios where the Bernstein-von Mises approximation is poor.Lire moins >
Langue :
Anglais
Vulgarisation :
Non
Collections :
Source :
Fichiers
- https://hal.archives-ouvertes.fr/hal-01355287/document
- Accès libre
- Accéder au document
- https://hal.archives-ouvertes.fr/hal-01355287/document
- Accès libre
- Accéder au document
- https://hal.archives-ouvertes.fr/hal-01355287/document
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- 1505.02827v1.pdf
- Accès libre
- Accéder au document