• English
    • français
  • Help
  •  | 
  • Contact
  •  | 
  • About
  •  | 
  • Login
  • HAL portal
  •  | 
  • Pages Pro
  • EN
  •  / 
  • FR
View Item 
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
  •   LillOA Home
  • Liste des unités
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Study on Acoustic Model Personalization ...
  • BibTeX
  • CSV
  • Excel
  • RIS

Document type :
Communication dans un congrès avec actes
DOI :
10.1007/978-3-030-87802-3_39
Title :
Study on Acoustic Model Personalization in a Context of Collaborative Learning Constrained by Privacy Preservation
Author(s) :
Mdhaffar, Salima [Auteur]
Laboratoire Informatique d'Avignon [LIA]
Tommasi, Marc [Auteur]
Machine Learning in Information Networks [MAGNET]
Estève, Yannick [Auteur]
Laboratoire Informatique d'Avignon [LIA]
Conference title :
SPECOM 2021 - 23rd International Conference on Speech and Computer
City :
St Petersburg
Country :
Russie
Start date of the conference :
2021-09-27
Book title :
Speech and Computer 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, Proceedings
Publication date :
2021
English keyword(s) :
Automatic speech recognition
Privacy-protection
Collaborative learning
Acoustic models
Personalization
HAL domain(s) :
Informatique [cs]/Informatique et langage [cs.CL]
English abstract : [en]
This paper investigates different approaches in order to improve the performance of a speech recognition system for a given speaker by using no more than 5 min of speech from this speaker, and without exchanging data from ...
Show more >
This paper investigates different approaches in order to improve the performance of a speech recognition system for a given speaker by using no more than 5 min of speech from this speaker, and without exchanging data from other users/speakers. Inspired by the federated learning paradigm, we consider speakers that have access to a personalized database of their own speech, learn an acoustic model and collaborate with other speakers in a network to improve their model. Several local personalizations are explored depending on how aggregation mechanisms are performed. We study the impact of selecting, in an adaptive way, a subset of speakers's models based on a notion of similarity. We also investigate the effect of weighted averaging of fine-tuned and global models. In our approach, only neural acoustic model parameters are exchanged and no audio data is exchanged. By avoiding communicating their personal data, the proposed approach tends to preserve the privacy of speakers. Experiments conducted on the TEDLIUM 3 dataset show that the best improvement is given by averaging a subset of different acoustic models fine-tuned on several user datasets. Our approach applied to HMM/TDNN acoustic models improves quickly and significantly the ASR performance in terms of WER (for instance in one of our two evaluation datasets, from 14.84% to 13.45% with less than 5 min of speech per speaker).Show less >
Language :
Anglais
Peer reviewed article :
Oui
Audience :
Internationale
Popular science :
Non
ANR Project :
Apprentissage distribué, personnalisé, préservant la privacité pour le traitement de la parole
Collections :
  • Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Source :
Harvested from HAL
Files
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-03369206/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-03369206/document
  • Open access
  • Access the document
Thumbnail
  • https://hal.archives-ouvertes.fr/hal-03369206/document
  • Open access
  • Access the document
Université de Lille

Mentions légales
Université de Lille © 2017