Document type :
Conference paper with proceedings
DOI :
Title :
FP-Crawlers: Studying the Resilience of Browser Fingerprinting to Block Crawlers
Author(s) :
Vastel, Antoine [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Rudametkin, Walter [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Rouvoy, Romain [Author]
Self-adaptation for distributed services and large software systems [SPIRALS]
Institut Universitaire de France [IUF]
Blanc, Xavier [Author]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Institut Universitaire de France [IUF]
Scientific editor(s) :
Oleksii Starov
Alexandros Kapravelos
Nick Nikiforakis
Conference title :
MADWeb'20 - NDSS Workshop on Measurements, Attacks, and Defenses for the Web
City :
San Diego
Country :
United States of America
Start date of the conference :
2020-02-23
English keyword(s) :
Detection of bots
browser fingerprinting
HAL domain(s) :
Computer Science [cs]/Cryptography and Security [cs.CR]
Computer Science [cs]/Web
English abstract : [en]
Data available on the Web, such as financial data or public reviews, provides a competitive advantage to companies able to exploit them. Web crawlers, a category of bot, aim at automating the collection of publicly available Web data. While some crawlers collect data with the agreement of the websites being crawled, most crawlers do not respect the terms of service. CAPTCHAs and approaches based on analyzing series of HTTP requests classify users as humans or bots. However, these approaches require either user interaction or a significant volume of data before they can classify the traffic. In this paper, we study browser fingerprinting as a crawler detection mechanism. We crawled the Alexa top 10K and identified 291 websites that block crawlers. We show that fingerprinting is used by 93 (31.96%) of them and we report on the crawler detection techniques implemented by the major fingerprinters. Finally, we evaluate the resilience of fingerprinting against crawlers trying to conceal themselves. We show that although fingerprinting is good at detecting crawlers, it can be bypassed with little effort by an adversary with knowledge on the fingerprints collected.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
ANR Project :
Collections :
Source :
Files
- https://hal.inria.fr/hal-02441653/document
- Open access
- Access the document
- vastel-madweb20.pdf
- Open access
- Access the document