Spectral learning with proper probabilities for finite state automaton
Document type :
Conference paper with published proceedings
Title :
Spectral learning with proper probabilities for finite state automaton
Author(s) :
Glaude, Hadrien [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Sequential Learning [SEQUEL]
Thales Airborne Systems
Enderli, Cyrille [Author]
Thales Airborne Systems
Pietquin, Olivier [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Université de Lille, Sciences et Technologies
Institut universitaire de France [IUF]
Sequential Learning [SEQUEL]
Conference title :
ASRU 2015 - Automatic Speech Recognition and Understanding Workshop
City :
Scottsdale
Country :
United States of America
Start date of the conference :
2015-12-13
Journal title :
Proceedings of the Automatic Speech Recognition and Understanding Workshop
Publisher :
IEEE
English keyword(s) :
spectral learning
Baum-Welch
learning automata
non-negative matrix factorization
language models
HAL domain(s) :
Computer Science [cs]/Machine Learning [cs.LG]
Computer Science [cs]/Human-Computer Interaction [cs.HC]
English abstract : [en]
Probabilistic Finite Automata (PFA), Probabilistic Finite State Transducers (PFST) and Hidden Markov Models (HMM) are widely used in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) systems and Part-Of-Speech (POS) tagging for language modeling. Traditionally, unsupervised learning of these latent variable models is done by Expectation-Maximization (EM)-like algorithms, such as the Baum-Welch algorithm. In a recent alternative line of work, learning algorithms based on spectral properties of some low-order moment matrices or tensors have been proposed. In comparison to EM, they are orders of magnitude faster and come with theoretical convergence guarantees. However, the returned models are not guaranteed to compute proper distributions: they often return negative values that do not sum to one, which limits their applicability and prevents them from serving as an initialization to EM-like algorithms. In this paper, we propose a new spectral algorithm able to learn a large range of models constrained to return proper distributions. We assess its performance on synthetic problems from the PAutomaC challenge and on real datasets extracted from Wikipedia. Experiments show that it outperforms previous spectral approaches as well as the Baum-Welch algorithm with random restarts, in addition to serving as an efficient initialization step for EM-like algorithms.
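For context, the unconstrained baseline the abstract contrasts with is classical spectral learning from a low-order moment (Hankel) matrix. The sketch below illustrates that standard Hankel/SVD recovery in Python/NumPy; the toy sample, alphabet, prefix/suffix basis and rank are illustrative assumptions, not the paper's actual setup, and the learned model can assign negative values to strings, which is exactly the deficiency the paper addresses.

```python
# Minimal sketch of classical (unconstrained) spectral learning of a weighted
# automaton from an empirical Hankel matrix. Assumed toy data and basis.
import numpy as np
from collections import Counter

sample = ["ab", "a", "aab", "b", "ab", "aab", "a", "abb", "ab", "a"]  # assumed sample
alphabet = ["a", "b"]
prefixes = ["", "a", "b"]   # assumed prefix basis (includes the empty string)
suffixes = ["", "a", "b"]   # assumed suffix basis
rank = 2                    # assumed number of states

counts = Counter(sample)
N = len(sample)

def p(x):
    """Empirical probability of the full string x."""
    return counts[x] / N

# Empirical Hankel blocks: H[u, v] = p(uv), H_sigma[u, v] = p(u sigma v).
H = np.array([[p(u + v) for v in suffixes] for u in prefixes])
H_sig = {s: np.array([[p(u + s + v) for v in suffixes] for u in prefixes])
         for s in alphabet}
h_P = np.array([p(u) for u in prefixes])   # p(u) for each prefix u
h_S = np.array([p(v) for v in suffixes])   # p(v) for each suffix v

# Rank-truncated SVD of the Hankel matrix.
U, D, Vt = np.linalg.svd(H)
V = Vt[:rank].T

# Standard spectral parameter recovery:
#   alpha_1^T = h_S^T V,  alpha_inf = (H V)^+ h_P,  A_sigma = (H V)^+ H_sigma V
HV_pinv = np.linalg.pinv(H @ V)
alpha1 = h_S @ V
alpha_inf = HV_pinv @ h_P
A = {s: HV_pinv @ H_sig[s] @ V for s in alphabet}

def score(x):
    """Value the learned automaton assigns to string x; it may be negative
    or fail to sum to one, since nothing constrains it to be a distribution."""
    v = alpha1.copy()
    for s in x:
        v = v @ A[s]
    return float(v @ alpha_inf)

for x in ["a", "ab", "ba"]:
    print(x, score(x))
```

The paper's contribution is to constrain this kind of moment-based estimation so that the recovered parameters define a proper probability distribution, making the result usable directly or as an initialization for Baum-Welch.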
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
Source :
Files
- https://hal.inria.fr/hal-01225810/document (Open access)
- ASRU_2015_HGCEOP.pdf (Open access)