Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition - Laboratoire d'Informatique de Grenoble
Conference paper (Preprint/Prepublication) Year: 2022

Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition

Abstract

Convolutional Neural Networks (CNNs) have been shown to capture spatial aspects of the speech signal and to eliminate spectral variations across speakers in Automatic Speech Recognition. In this study, we investigate combining a CNN with a Time Delay Neural Network (TDNN) as the acoustic model for large vocabulary continuous speech recognition in Khmer. The idea is to use the CNN to extract local features of the speech signal, while the TDNN captures long temporal correlations between acoustic events. Experimental results show that the proposed network outperforms the TDNN alone, achieving an average relative improvement of 14% across test sets.
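
The division of labour described in the abstract can be illustrated with a small pure-Python sketch: a convolution along the frequency axis of each frame (local features), followed by Time Delay Neural Network layers that splice frames at fixed time offsets (temporal context). This record does not give the paper's layer sizes, offsets, or input features, so every function name, offset, weight, and input value below is a hypothetical choice made only for illustration.

```python
# Illustrative sketch only: names, offsets, weights, and inputs are assumptions,
# not the configuration used in the paper.

def relu(x):
    return x if x > 0 else 0.0

def conv_over_frequency(frame, kernel):
    """Slide a small kernel along the frequency axis of one frame (local features)."""
    k = len(kernel)
    return [
        relu(sum(kernel[j] * frame[i + j] for j in range(k)))
        for i in range(len(frame) - k + 1)
    ]

def tdnn_layer(frames, offsets, weights):
    """Combine frames at the given time offsets (a dilated 1-D convolution over time).

    frames  : list of feature vectors, one per time step
    offsets : time offsets to splice, e.g. (-3, 0, 3)
    weights : one weight vector per offset, each as long as a frame
    Only time steps where every offset stays inside the sequence are produced.
    """
    left = max(0, -min(offsets))
    right = max(0, max(offsets))
    out = []
    for t in range(left, len(frames) - right):
        total = 0.0
        for off, w in zip(offsets, weights):
            total += sum(wi * xi for wi, xi in zip(w, frames[t + off]))
        out.append([relu(total)])  # a single output unit, for brevity
    return out

def temporal_context(layer_offsets):
    """Total (left, right) context, in frames, seen by a stack of TDNN layers."""
    left = sum(max(0, -min(o)) for o in layer_offsets)
    right = sum(max(0, max(o)) for o in layer_offsets)
    return left, right

# Toy input: 12 frames of 5 made-up feature values each.
frames = [[float((t + f) % 4) for f in range(5)] for t in range(12)]
local = [conv_over_frequency(fr, [0.5, 0.0, -0.5]) for fr in frames]
hidden = tdnn_layer(local, (-1, 0, 1), [[0.1] * 3] * 3)
output = tdnn_layer(hidden, (-3, 0, 3), [[1.0]] * 3)
context = temporal_context([(-1, 0, 1), (-3, 0, 3)])
```

In this toy setup, stacking a narrow layer with a wider, subsampled one lets each output frame depend on several input frames to either side while each layer only touches three time steps, which is the usual argument for using time-delay layers to model long temporal context.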
Main file
main.pdf (304.53 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03865538, version 1 (22-11-2022)

Identifiers

  • HAL Id: hal-03865538, version 1

Cite

Nalin Srun, Sotheara Leang, Ye Kyaw, Sethserey Sam. Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition. iSAI-NLP-AIoT 2022, Nov 2022, Chiang Mai, Thailand. ⟨hal-03865538⟩
84 views
145 downloads
