GoldFinger: Fast & Approximate Jaccard for ...
Type de document :
Compte-rendu et recension critique d'ouvrage
Titre :
GoldFinger: Fast & Approximate Jaccard for Efficient KNN Graph Constructions
Auteur(s) :
Guerraoui, Rachid [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Niot, Guilhem [Auteur]
École normale supérieure de Lyon [ENS de Lyon]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Inria-EPFL [Inria-EPFL]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Niot, Guilhem [Auteur]
École normale supérieure de Lyon [ENS de Lyon]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Inria-EPFL [Inria-EPFL]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Titre de la revue :
IEEE Transactions on Knowledge and Data Engineering
Pagination :
11461-11475
Éditeur :
Institute of Electrical and Electronics Engineers
Date de publication :
2023-11-01
ISSN :
1041-4347
Mot(s)-clé(s) en anglais :
KNN graphs fingerprint similarity
KNN graphs
fingerprint
similarity
KNN graphs
fingerprint
similarity
Discipline(s) HAL :
Informatique [cs]
Résumé en anglais : [en]
We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor ...
Lire la suite >We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach reduces computation times by up to 78.9% compared to raw data while only incurring a negligible to moderate loss in terms of KNN quality. We also show that GoldFinger can be applied to KNN queries (a widely-used search technique) and delivers speedups of up to ×3.55 over one of the most efficient approaches to this problem.Lire moins >
Lire la suite >We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach reduces computation times by up to 78.9% compared to raw data while only incurring a negligible to moderate loss in terms of KNN quality. We also show that GoldFinger can be applied to KNN queries (a widely-used search technique) and delivers speedups of up to ×3.55 over one of the most efficient approaches to this problem.Lire moins >
Langue :
Anglais
Vulgarisation :
Non
Collections :
Source :
Fichiers
- document
- Accès libre
- Accéder au document
- goldfinger_TKDE.pdf
- Accès libre
- Accéder au document
- document
- Accès libre
- Accéder au document
- goldfinger_TKDE.pdf
- Accès libre
- Accéder au document