Document type:
Conference paper with proceedings
Title:
A Machine of Few Words: Interactive Speaker Recognition with Reinforcement Learning
Author(s):
Seurin, Mathieu [Author]
Scool [Scool]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Strub, Florian [Author]
DeepMind [Paris]
Preux, Philippe [Author]
Scool [Scool]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Pietquin, Olivier [Author]
Google Research [Paris]
Scool [Scool]
Sequential Learning [SEQUEL]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Conference title:
Conference of the International Speech Communication Association (INTERSPEECH)
City:
Shanghai
Country:
China
Conference start date:
2020-10-25
Proceedings title:
Interspeech 2020 proceedings
English keyword(s):
active speaker recognition
reinforcement learning
deep learning
iterative representation learning
HAL discipline(s):
Computer Science [cs]
Computer Science [cs]/Machine Learning [cs.LG]
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]/Signal and Image Processing [eess.SP]
English abstract: [en]
Speaker recognition is a well-known and well-studied task in the speech processing domain. It has many applications, either for security or for speaker adaptation of personal devices. In this paper, we present a new paradigm for automatic speaker recognition that we call Interactive Speaker Recognition (ISR). In this paradigm, the recognition system aims to incrementally build a representation of the speakers by requesting personalized utterances to be spoken, in contrast to the standard text-dependent or text-independent schemes. To do so, we cast the speaker recognition task into a sequential decision-making problem that we solve with Reinforcement Learning. Using a standard dataset, we show that our method achieves excellent performance while using only small amounts of speech signal. This method could also be applied as an utterance selection mechanism for building speech synthesis systems.
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Files:
- https://hal.archives-ouvertes.fr/hal-03123999/document (open access)
- http://arxiv.org/pdf/2008.03127 (open access)