Leveraging Data Geometry to Mitigate CSM ...
Document type :
Autre communication scientifique (congrès sans actes - poster - séminaire...): Communication dans un congrès avec actes
Title :
Leveraging Data Geometry to Mitigate CSM in Steganalysis
Author(s) :
Abecidan, Rony [Auteur]
Centre National de la Recherche Scientifique [CNRS]
Université de Lille
Centrale Lille
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Itier, Vincent [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Ecole nationale supérieure Mines-Télécom Lille Douai [IMT Nord Europe]
Boulanger, Jérémie [Auteur]
Université de Lille
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Bas, Patrick [Auteur]
Centrale Lille
Centre National de la Recherche Scientifique [CNRS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Pevný, Tomáš [Auteur]
Czech Technical University in Prague [CTU]
Centre National de la Recherche Scientifique [CNRS]
Université de Lille
Centrale Lille
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Itier, Vincent [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Ecole nationale supérieure Mines-Télécom Lille Douai [IMT Nord Europe]
Boulanger, Jérémie [Auteur]
Université de Lille
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Bas, Patrick [Auteur]

Centrale Lille
Centre National de la Recherche Scientifique [CNRS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Pevný, Tomáš [Auteur]
Czech Technical University in Prague [CTU]
Conference title :
IEEE International Workshop on Information Forensics and Security (WIFS 2023)
City :
Nuremberg
Country :
Allemagne
Start date of the conference :
2023-12-04
Publication date :
2023-12-04
English keyword(s) :
steganalysis
steganography
forensics
cover source mismatch
domain generalization
machine learning
domain adaptation
data adaptation
steganography
forensics
cover source mismatch
domain generalization
machine learning
domain adaptation
data adaptation
HAL domain(s) :
Informatique [cs]/Traitement du signal et de l'image [eess.SP]
Informatique [cs]/Intelligence artificielle [cs.AI]
Informatique [cs]/Cryptographie et sécurité [cs.CR]
Informatique [cs]/Vision par ordinateur et reconnaissance de formes [cs.CV]
Informatique [cs]/Multimédia [cs.MM]
Statistiques [stat]/Machine Learning [stat.ML]
Informatique [cs]/Intelligence artificielle [cs.AI]
Informatique [cs]/Cryptographie et sécurité [cs.CR]
Informatique [cs]/Vision par ordinateur et reconnaissance de formes [cs.CV]
Informatique [cs]/Multimédia [cs.MM]
Statistiques [stat]/Machine Learning [stat.ML]
English abstract : [en]
In operational scenarios, steganographers use sets of covers from various sensors and processing pipelines that differ significantly from those used by researchers to train steganalysis models. This leads to an inevitable ...
Show more >In operational scenarios, steganographers use sets of covers from various sensors and processing pipelines that differ significantly from those used by researchers to train steganalysis models. This leads to an inevitable performance gap when dealing with out-of-distribution covers, commonly referred to as Cover Source Mismatch (CSM). In this study, we consider the scenario where test images are processed using the same pipeline. However, knowledge regarding both the labels and the balance between cover and stego is missing. Our objective is to identify a training dataset that allows for maximum generalization to our target. By exploring a grid of processing pipelines fostering CSM, we discovered a geometrical metric based on the chordal distance between subspaces spanned by DCTr features, that exhibits high correlation with operational regret while being not affected by the cover-stego balance. Our contribution lies in the development of a strategy that enables the selection or derivation of customized training datasets, enhancing the overall generalization performance for a given target. Experimental validation highlights that our geometry-based optimization strategy outperforms traditional atomistic methods given reasonable assumptions. Additional resources are available at github.com/RonyAbecidan/LeveragingGeometrytoMitigateCSM.Show less >
Show more >In operational scenarios, steganographers use sets of covers from various sensors and processing pipelines that differ significantly from those used by researchers to train steganalysis models. This leads to an inevitable performance gap when dealing with out-of-distribution covers, commonly referred to as Cover Source Mismatch (CSM). In this study, we consider the scenario where test images are processed using the same pipeline. However, knowledge regarding both the labels and the balance between cover and stego is missing. Our objective is to identify a training dataset that allows for maximum generalization to our target. By exploring a grid of processing pipelines fostering CSM, we discovered a geometrical metric based on the chordal distance between subspaces spanned by DCTr features, that exhibits high correlation with operational regret while being not affected by the cover-stego balance. Our contribution lies in the development of a strategy that enables the selection or derivation of customized training datasets, enhancing the overall generalization performance for a given target. Experimental validation highlights that our geometry-based optimization strategy outperforms traditional atomistic methods given reasonable assumptions. Additional resources are available at github.com/RonyAbecidan/LeveragingGeometrytoMitigateCSM.Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
Collections :
Source :
Files
- document
- Open access
- Access the document
- 2023_wifs.pdf
- Open access
- Access the document
- 2310.04479
- Open access
- Access the document