GoldFinger: Fast & Approximate Jaccard for ...
Document type :
Compte-rendu et recension critique d'ouvrage
Title :
GoldFinger: Fast & Approximate Jaccard for Efficient KNN Graph Constructions
Author(s) :
Guerraoui, Rachid [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Niot, Guilhem [Auteur]
École normale supérieure de Lyon [ENS de Lyon]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Inria-EPFL [Inria-EPFL]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Kermarrec, Anne-Marie [Auteur]
Ecole Polytechnique Fédérale de Lausanne [EPFL]
Niot, Guilhem [Auteur]
École normale supérieure de Lyon [ENS de Lyon]
Ruas, Olivier [Auteur]
Self-adaptation for distributed services and large software systems [SPIRALS]
Inria-EPFL [Inria-EPFL]
Taïani, François [Auteur]
the World Is Distributed Exploring the tension between scale and coordination [WIDE]
Journal title :
IEEE Transactions on Knowledge and Data Engineering
Pages :
11461-11475
Publisher :
Institute of Electrical and Electronics Engineers
Publication date :
2023-11-01
ISSN :
1041-4347
English keyword(s) :
KNN graphs fingerprint similarity
KNN graphs
fingerprint
similarity
KNN graphs
fingerprint
similarity
HAL domain(s) :
Informatique [cs]
English abstract : [en]
We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor ...
Show more >We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach reduces computation times by up to 78.9% compared to raw data while only incurring a negligible to moderate loss in terms of KNN quality. We also show that GoldFinger can be applied to KNN queries (a widely-used search technique) and delivers speedups of up to ×3.55 over one of the most efficient approaches to this problem.Show less >
Show more >We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard's index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users' privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach reduces computation times by up to 78.9% compared to raw data while only incurring a negligible to moderate loss in terms of KNN quality. We also show that GoldFinger can be applied to KNN queries (a widely-used search technique) and delivers speedups of up to ×3.55 over one of the most efficient approaches to this problem.Show less >
Language :
Anglais
Popular science :
Non
Collections :
Source :
Files
- document
- Open access
- Access the document
- goldfinger_TKDE.pdf
- Open access
- Access the document
- document
- Open access
- Access the document
- goldfinger_TKDE.pdf
- Open access
- Access the document