Document type:
Other scientific communication (conference without proceedings - poster - seminar...)
Title:
supersampler: Fractional hitting set implementation for lightweight genomic data sketching
Author(s):
Limasset, Antoine [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Rouzé, Timothé [Author]
Martayan, Igor [Author]
Marchet, Camille [Author]
English keyword(s):
minimizers
indexing
kmers
HAL discipline(s):
Computer Science [cs]/Bioinformatics [q-bio.QM]
English abstract: [en]
Bird's-eye view: SuperSampler (SPSP) is an implementation of a novel k-mer selection scheme that we call Fractional Hitting Sets (FHS), a generalisation of Universal Hitting Sets (UHS). It quickly creates sketches of genomes or metagenomes and compares such sketches to obtain Containment or Jaccard indices of the input data. SuperSampler uses super-k-mers instead of k-mers, which allows for lighter sketches, lower RAM usage and less computation time when performing comparisons than traditional subsampling methods, thanks to a sketch organisation made possible by the super-k-mer structure. Sketch creation applies FracMinHash to the selection of minimizers (a minimizer is the m-mer of a k-mer whose hash value is minimal). When a minimizer is selected, every k-mer around it that shares the same minimizer is selected, and together these k-mers form a super-k-mer.
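The selection scheme described in the abstract can be illustrated with a minimal Python sketch. It is not SuperSampler's code: the hash function (BLAKE2b), the function names (`h64`, `minimizer`, `sketch`, `kmers_of`, `containment`, `jaccard`) and the `fraction` parameter are assumptions made for illustration, k-mers are not canonicalised (reverse complements are ignored), and consecutive k-mers are grouped by minimizer string rather than by minimizer position.

```python
import hashlib

def h64(s: str) -> int:
    """Deterministic 64-bit hash of a string (BLAKE2b is an illustrative choice)."""
    return int.from_bytes(hashlib.blake2b(s.encode(), digest_size=8).digest(), "big")

def minimizer(kmer: str, m: int) -> str:
    """Return the m-mer of `kmer` whose hash value is minimal."""
    return min((kmer[i:i + m] for i in range(len(kmer) - m + 1)), key=h64)

def sketch(seq: str, k: int, m: int, fraction: float) -> dict[str, list[str]]:
    """Map each selected minimizer to the super-k-mers it covers.

    FracMinHash-style rule (assumed here): a minimizer is selected when its
    hash falls below fraction * 2**64.
    """
    threshold = int(fraction * 2**64)
    out: dict[str, list[str]] = {}
    n_kmers = len(seq) - k + 1
    start, current = 0, None
    for i in range(n_kmers + 1):
        # A sentinel (None) after the last k-mer flushes the final run.
        mini = minimizer(seq[i:i + k], m) if i < n_kmers else None
        if mini != current:
            if current is not None and h64(current) < threshold:
                # The run covers k-mers start..i-1, i.e. seq[start : i-1+k].
                out.setdefault(current, []).append(seq[start:i - 1 + k])
            start, current = i, mini
    return out

def kmers_of(sk: dict[str, list[str]], k: int) -> set[str]:
    """Expand the super-k-mers of a sketch back into their k-mers."""
    return {s[i:i + k] for sks in sk.values() for s in sks
            for i in range(len(s) - k + 1)}

def containment(a: dict[str, list[str]], b: dict[str, list[str]], k: int) -> float:
    """Fraction of the k-mers sketched from `a` that also appear in `b`."""
    ka, kb = kmers_of(a, k), kmers_of(b, k)
    return len(ka & kb) / len(ka) if ka else 0.0

def jaccard(a: dict[str, list[str]], b: dict[str, list[str]], k: int) -> float:
    """Jaccard index of the k-mer sets of two sketches."""
    ka, kb = kmers_of(a, k), kmers_of(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0
```

Because selection depends only on the minimizer's hash, two sequences sketched with the same `k`, `m` and `fraction` keep or drop a shared k-mer together, which is what makes the sketches comparable. Keying the sketch by minimizer is one way to picture the sketch organisation that the super-k-mer structure allows; the actual layout used by SuperSampler is not described in this record.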
Language:
English
ANR project:
Collections:
Source:
Files
- document
- Open access
- Access the document
- supersampler-main.zip
- Open access
- Access the document