Flattening a Hierarchical Clustering through ...
Type de document :
Communication dans un congrès avec actes
Titre :
Flattening a Hierarchical Clustering through Active Learning
Auteur(s) :
Vitale, Fabio [Auteur]
Machine Learning in Information Networks [MAGNET]
Rajagopalan, Anand [Auteur]
Google Inc
Gentile, Claudio [Auteur]
Google Inc
Machine Learning in Information Networks [MAGNET]
Rajagopalan, Anand [Auteur]
Google Inc
Gentile, Claudio [Auteur]
Google Inc
Titre de la manifestation scientifique :
Conference on Neural Information Processing Systems
Ville :
Vancouver
Pays :
Canada
Date de début de la manifestation scientifique :
2019-12-08
Discipline(s) HAL :
Informatique [cs]/Intelligence artificielle [cs.AI]
Résumé en anglais : [en]
We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries ...
Lire la suite >We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known important-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to linear-time implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.Lire moins >
Lire la suite >We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known important-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to linear-time implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.Lire moins >
Langue :
Anglais
Comité de lecture :
Oui
Audience :
Internationale
Vulgarisation :
Non
Collections :
Source :
Fichiers
- https://hal.inria.fr/hal-02376981/document
- Accès libre
- Accéder au document
- https://hal.inria.fr/hal-02376981/document
- Accès libre
- Accéder au document
- https://hal.inria.fr/hal-02376981/document
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- 1906.09458.pdf
- Accès libre
- Accéder au document