Cluster-and-Conquer: When Randomness Meets ...
Document type :
Communication dans un congrès avec actes
Title :
Cluster-and-Conquer: When Randomness Meets Graph Locality
Author(s) :
Giakkoupis, George [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Conference title :
ICDE 2021 - IEEE 37th International Conference on Data Engineering
City :
Chania
Country :
Grèce
Start date of the conference :
2021-04-19
Publisher :
IEEE
HAL domain(s) :
Informatique [cs]/Apprentissage [cs.LG]
Informatique [cs]/Algorithme et structure de données [cs.DS]
Informatique [cs]/Calcul parallèle, distribué et partagé [cs.DC]
Informatique [cs]/Base de données [cs.DB]
Informatique [cs]/Algorithme et structure de données [cs.DS]
Informatique [cs]/Calcul parallèle, distribué et partagé [cs.DC]
Informatique [cs]/Base de données [cs.DB]
English abstract : [en]
K-Nearest-Neighbors (KNN) graphs are central to many emblematic data mining and machine-learning applications. Some of the most efficient KNN graph algorithms are incremental and local: they start from a random graph, which ...
Show more >K-Nearest-Neighbors (KNN) graphs are central to many emblematic data mining and machine-learning applications. Some of the most efficient KNN graph algorithms are incremental and local: they start from a random graph, which they incrementally improve by traversing neighbors-of-neighbors links. Unfortunately, the initial random graph exhibits a poor graph locality, leading to many unnecessary similarity computations. In this paper, we remove this drawback with Clusterand-Conquer (C 2 for short). Cluster-and-Conquer boosts the starting configuration of greedy algorithms thanks to a novel lightweight clustering mechanism, dubbed FastRandomHash. FastRandomHash leverages randomness and recursion to precluster similar nodes at a very low cost. Our extensive evaluation on real datasets shows that Cluster-and-Conquer significantly outperforms existing approaches, including LSH, yielding speedups of up to ×4.42 and even improving the KNN quality.Show less >
Show more >K-Nearest-Neighbors (KNN) graphs are central to many emblematic data mining and machine-learning applications. Some of the most efficient KNN graph algorithms are incremental and local: they start from a random graph, which they incrementally improve by traversing neighbors-of-neighbors links. Unfortunately, the initial random graph exhibits a poor graph locality, leading to many unnecessary similarity computations. In this paper, we remove this drawback with Clusterand-Conquer (C 2 for short). Cluster-and-Conquer boosts the starting configuration of greedy algorithms thanks to a novel lightweight clustering mechanism, dubbed FastRandomHash. FastRandomHash leverages randomness and recursion to precluster similar nodes at a very low cost. Our extensive evaluation on real datasets shows that Cluster-and-Conquer significantly outperforms existing approaches, including LSH, yielding speedups of up to ×4.42 and even improving the KNN quality.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
Collections :
Source :
Files
- https://hal.inria.fr/hal-03346860/document
- Open access
- Access the document
- http://arxiv.org/pdf/2010.11497
- Open access
- Access the document
- https://hal.inria.fr/hal-03346860/document
- Open access
- Access the document
- https://hal.inria.fr/hal-03346860/document
- Open access
- Access the document
- document
- Open access
- Access the document
- ClusterAndConquer_short_ICDE.pdf
- Open access
- Access the document
- 2010.11497
- Open access
- Access the document
- document
- Open access
- Access the document
- ClusterAndConquer_short_ICDE.pdf
- Open access
- Access the document