Interpreting Neural Networks as Majority Votes through the PAC-Bayesian Theory
Document type:
Other scientific communication (conference without proceedings, poster, seminar, etc.): communication at a conference without proceedings
Permanent URL:
Title:
Interpreting Neural Networks as Majority Votes through the PAC-Bayesian Theory
Author(s):
Viallard, Paul [Auteur]
Emonet, Rémi [Auteur]
Germain, Pascal [Auteur]
Habrard, Amaury [Auteur]
Morvant, Emilie [Auteur]
Scientific event title:
Workshop on Machine Learning with guarantees @ NeurIPS 2019
City:
Vancouver
Country:
Canada
Scientific event start date:
2019-12-14
Publication date:
2019
HAL discipline(s):
Statistics [stat]/Machine Learning [stat.ML]
English abstract: [en]
We propose a PAC-Bayesian theoretical study of the two-phase learning procedure of a neural network introduced by Kawaguchi et al. (2017). In this procedure, a network is expressed as a weighted combination of all the paths of the network (from the input layer to the output layer), which we reformulate as a PAC-Bayesian majority vote. Starting from this observation, their learning procedure consists of (1) learning a "prior" network to fix some parameters, then (2) learning a "posterior" network by only allowing a modification of the weights over the paths of the prior network. This allows us to derive a PAC-Bayesian generalization bound that involves the empirical individual risks of the paths (known as the Gibbs risk) and the empirical diversity between pairs of paths. Note that, as in classical PAC-Bayesian bounds, our result involves a KL-divergence term between the "prior" network and the "posterior" network. We show that this term is computable by dynamic programming without assuming any distribution on the network weights.
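For context, here is a hedged sketch of the classical PAC-Bayesian quantities the abstract refers to: the standard relation between the majority-vote risk and the Gibbs risk, and a Seeger-style bound featuring the KL-divergence term between posterior rho and prior pi over voters (here, network paths). These are textbook forms, not the paper's specific result:

    % Generic PAC-Bayesian forms (illustration only, not the paper's bound).
    % \rho is the posterior and \pi the prior over voters; S is an i.i.d. sample of size m drawn from D.
    \[
      R_D\big(\mathrm{MV}_\rho\big) \;\le\; 2\,\mathbb{E}_{h\sim\rho}\, R_D(h)
      \qquad \text{(majority-vote risk vs.\ Gibbs risk)}
    \]
    \[
      \mathrm{kl}\!\Big(\mathbb{E}_{h\sim\rho}\widehat{R}_S(h)\,\Big\|\,\mathbb{E}_{h\sim\rho} R_D(h)\Big)
      \;\le\; \frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln\frac{2\sqrt{m}}{\delta}}{m}
      \qquad \text{with probability at least } 1-\delta.
    \]

Likewise, as an illustration only (not the paper's algorithm for the KL term), summing products of edge weights over all input-to-output paths of a layered network can be done by dynamic programming in a single forward pass instead of enumerating the exponentially many paths. The function name and toy example below are hypothetical:

    # Illustrative sketch: dynamic programming over paths of a layered network.
    # layers[l][i][j] is the weight of the edge from node i of layer l to node j of layer l+1.
    def sum_of_path_products(layers):
        # Each input node starts with a path-sum of 1 (the empty path).
        values = [1.0] * len(layers[0])
        for weights in layers:
            next_values = [0.0] * len(weights[0])
            for i, v in enumerate(values):
                for j, w in enumerate(weights[i]):
                    # Extend every path ending at node i by the edge (i, j).
                    next_values[j] += v * w
            values = next_values
        # values[k] = sum over all paths reaching output node k of the product of edge weights.
        return values

    # Toy 2-2-1 network: the four path products sum to 1.15.
    print(sum_of_path_products([[[0.5, 2.0], [1.0, -1.0]], [[0.3], [0.7]]]))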
Language:
English
Audience:
International
Popular science:
No
Institution(s):
CNRS
Université de Lille
Deposit date:
2020-06-08T14:10:29Z