Kernel Approximation Methods for Speech Recognition
Document type:
Book review and critical review
Title:
Kernel Approximation Methods for Speech Recognition
Author(s):
May, Avner [Author]
Columbia University [New York]
Bagheri Garakani, Alireza [Author]
University of Southern California [USC]
Lu, Zhiyun [Author]
University of Southern California [USC]
Guo, Dong [Author]
University of Southern California [USC]
Liu, Kuan [Author]
University of Southern California [USC]
Bellet, Aurelien [Author]
Machine Learning in Information Networks [MAGNET]
Fan, Linxi [Author]
Stanford University
Collins, Michael [Author]
Columbia University [New York]
Hsu, Daniel [Author]
Columbia University [New York]
Kingsbury, Brian [Author]
IBM Thomas J. Watson Research Center
Picheny, Michael [Author]
IBM Thomas J. Watson Research Center
Sha, Fei [Author]
University of Southern California [USC]
Journal title:
Journal of Machine Learning Research
Pages:
1-36
Publisher:
Microtome Publishing
Publication date:
2019
ISSN:
1532-4435
English keyword(s):
kernel methods
deep neural networks
acoustic modeling
automatic speech recognition
feature selection
HAL discipline(s):
Computer Science [cs]/Machine Learning [cs.LG]
Statistics [stat]/Machine Learning [stat.ML]
English abstract: [en]
We study the performance of kernel methods on the acoustic modeling task for automatic speech recognition, and compare their performance to deep neural networks (DNNs). To scale the kernel methods to large data sets, we use the random Fourier feature method of Rahimi and Recht (2007). We propose two novel techniques for improving the performance of kernel acoustic models. First, we propose a simple but effective feature selection method which reduces the number of random features required to attain a fixed level of performance. Second, we present a number of metrics which correlate strongly with speech recognition performance when computed on the heldout set; we attain improved performance by using these metrics to decide when to stop training. Additionally, we show that the linear bottleneck method of Sainath et al. (2013a) improves the performance of our kernel models significantly, in addition to speeding up training and making the models more compact. Leveraging these three methods, the kernel methods attain token error rates between 0.5% better and 0.1% worse than fully-connected DNNs across four speech recognition data sets, including the TIMIT and Broadcast News benchmark tasks.
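The scaling technique named in the abstract, the random Fourier feature method of Rahimi and Recht (2007), is standard and easy to sketch. The NumPy snippet below is a minimal illustration of that general technique, not the paper's implementation: the Gaussian kernel choice, the bandwidth `sigma`, the feature count `n_features`, and the toy data are all assumptions made for this example.

```python
import numpy as np

def random_fourier_features(X, n_features=2000, sigma=1.0, seed=0):
    """Map X of shape (n, d) to features Z such that Z @ Z.T approximates
    the Gaussian kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies sampled from the kernel's spectral density, N(0, sigma^-2 I).
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
    # Random phases let a single cosine per feature suffice.
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Sanity check on toy data: feature dot products converge to the exact
# kernel values as n_features grows.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 10))
Z = random_fourier_features(X, n_features=50000, sigma=1.0)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
exact = np.exp(-sq_dists / 2.0)
print(np.abs(Z @ Z.T - exact).max())  # small; shrinks as n_features grows
```

On top of such a feature map the paper trains a linear acoustic model, and the linear bottleneck method it borrows from Sainath et al. (2013a) amounts, roughly, to factoring the large output weight matrix into a product of two low-rank matrices, which is what makes the models smaller and faster to train.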
Language:
English
Popular science:
No
Files:
- jmlr19.pdf: https://hal.inria.fr/hal-02166422/document (open access)