Flattening a Hierarchical Clustering through ...
Document type :
Communication dans un congrès avec actes
Title :
Flattening a Hierarchical Clustering through Active Learning
Author(s) :
Vitale, Fabio [Auteur]
Machine Learning in Information Networks [MAGNET]
Rajagopalan, Anand [Auteur]
Google Inc
Gentile, Claudio [Auteur]
Google Inc
Machine Learning in Information Networks [MAGNET]
Rajagopalan, Anand [Auteur]
Google Inc
Gentile, Claudio [Auteur]
Google Inc
Conference title :
Conference on Neural Information Processing Systems
City :
Vancouver
Country :
Canada
Start date of the conference :
2019-12-08
HAL domain(s) :
Informatique [cs]/Intelligence artificielle [cs.AI]
English abstract : [en]
We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries ...
Show more >We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known important-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to linear-time implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.Show less >
Show more >We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known important-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to linear-time implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
Collections :
Source :
Files
- https://hal.inria.fr/hal-02376981/document
- Open access
- Access the document
- https://hal.inria.fr/hal-02376981/document
- Open access
- Access the document
- https://hal.inria.fr/hal-02376981/document
- Open access
- Access the document
- document
- Open access
- Access the document
- 1906.09458.pdf
- Open access
- Access the document