Statistical comparison in empirical computer science with minimal computation usage

Mathieu, Timothée; Preux, Philippe

Document type:

Scientific editorship of a publication (book, journal special issue, proceedings): Proceedings

DOI:

10.1145/3641525.3663618

Title:

Statistical comparison in empirical computer science with minimal computation usage

Author(s):

Mathieu, Timothée [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Scool [Scool]
Preux, Philippe [Author]

Conference title:

ACM REP '24: ACM Conference on Reproducibility and Replicability

Publisher:

ACM

Publication date:

2024-06-18

Keyword(s) in English:

• Mathematics of computing → Nonparametric statistics
• General and reference → Empirical studies
• Computing Statistical Reproducibility
Statistical Tests
Benchmarking

HAL discipline(s):

Statistics [stat]/Other [stat.ML]

Abstract in English: [en]

The replicability of computational experiments remains a fundamental question. For example, the machine learning community has recently become aware of the poor replicability of many experimental studies that aim at comparing the performance of various algorithms. Due to computational costs, it is often necessary to use methods that require as few computations as possible to obtain a replicable conclusion. The conclusion of the comparison should also be replicable, which calls for appropriate statistical tests. AdaStop is a recently introduced statistical test based on multiple group sequential tests. AdaStop adapts the number of executions of each experiment to stop as early as possible while ensuring that enough information is available to distinguish algorithms that perform better than the others in a statistically significant way. AdaStop was initially exemplified on reinforcement learning tasks. In this short paper, we consider 3 case studies to investigate the use of AdaStop beyond its original field of application, and demonstrate that it is a test that may be used on a wide range of application domains.
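
The idea of running experiments in batches and stopping as soon as a difference is statistically established can be illustrated with a small sketch. The code below is not AdaStop and does not reproduce its procedure: it is a deliberately simplified stand-in that compares only two algorithms, uses a plain two-sample permutation test at each interim look, and splits the significance level across looks with a conservative Bonferroni correction. All names (`permutation_pvalue`, `sequential_compare`, `batch_size`, `max_looks`) are hypothetical and chosen for illustration only.

```python
import random
from statistics import fmean


def permutation_pvalue(a, b, n_perm=2000, rng=None):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = rng or random.Random(0)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled scores
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # The +1 terms keep the Monte Carlo p-value strictly positive and valid.
    return (hits + 1) / (n_perm + 1)


def sequential_compare(run_a, run_b, batch_size=5, max_looks=4,
                       alpha=0.05, seed=0):
    """Run two algorithms in batches and stop at the first significant look.

    Simplifying assumption (NOT the AdaStop boundary): each look is tested
    at level alpha / max_looks, a Bonferroni split that keeps the overall
    type I error at most alpha but is more conservative than necessary.
    """
    rng = random.Random(seed)
    scores_a, scores_b = [], []
    p = 1.0
    for look in range(1, max_looks + 1):
        # One more batch of executions per algorithm at each interim look.
        scores_a += [run_a() for _ in range(batch_size)]
        scores_b += [run_b() for _ in range(batch_size)]
        p = permutation_pvalue(scores_a, scores_b, rng=rng)
        if p < alpha / max_looks:
            return {"decision": "different", "looks": look,
                    "runs_per_algo": len(scores_a), "p_value": p}
    return {"decision": "no difference detected", "looks": max_looks,
            "runs_per_algo": len(scores_a), "p_value": p}
```

Here `run_a` and `run_b` are assumed to be zero-argument callables that each execute one run of an algorithm and return a scalar score. When the two score distributions are well separated, the loop ends after few batches; when they are identical, it uses the full budget of `batch_size * max_looks` runs per algorithm and reports that no difference was detected.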

Language:

English

Audience:

International

Collections:

Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189

Source:

Harvested from HAL

Files:

document (open access)
adastop_acm.pdf (open access)
