Language Recognition using phonotactic-based Shifted Delta Coefficients and multiple phone recognizers

Luis Fernando D'Haro, Ricardo Cordoba, Christian Salamea, Javier Ferreiros

Research output: Contribution to journal, Conference article, peer-reviewed

4 Citations (Scopus)

Abstract

This paper describes a new language recognition technique that applies the Shifted Delta Coefficients (SDC) approach to phone log-likelihood ratio (PLLR) features. The method incorporates long-span phonetic information at the frame level while accounting for the temporal length of each phone unit. The proposed features are used to train an i-vector based system, which is tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg in comparison with different state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which generates multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows that incorporating state information from the phone ASR provides additional improvements, and that fusion with the other acoustic and phonotactic systems yields an improvement of 25.8% over the system presented during the competition.
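As a rough illustration of the two feature-processing ideas named in the abstract, the sketch below computes classic N-d-P-k shifted deltas over a frame-level feature matrix and stacks several feature streams before reducing their dimension. It is not the authors' implementation: the function names, the values of `d`, `P` and `k`, the edge padding, the use of PCA for the projection, and the order in which the steps are combined are all assumptions, and random numbers stand in for real PLLR features.

```python
import numpy as np


def shifted_delta(features, d=1, P=3, k=7):
    """Stack k delta blocks per frame (classic N-d-P-k SDC layout).

    features: array of shape (T, N), one row per frame.
    Block i at frame t is features[t + i*P + d] - features[t + i*P - d].
    Frames outside the signal are filled by repeating the edge frames
    (an assumption; other padding schemes are possible).
    """
    features = np.asarray(features, dtype=float)
    T = features.shape[0]
    pad_after = (k - 1) * P + d
    # After padding, original frame t sits at row t + d.
    padded = np.pad(features, ((d, pad_after), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        start = i * P
        plus = padded[start + 2 * d : start + 2 * d + T]  # frame t + i*P + d
        minus = padded[start : start + T]                 # frame t + i*P - d
        blocks.append(plus - minus)
    return np.hstack(blocks)  # shape (T, N * k)


def stack_and_project(streams, out_dim):
    """Concatenate per-recognizer streams and reduce them with PCA.

    PCA is an assumed choice here; the abstract only says the stacked
    coefficients are projected into a reduced dimension.
    """
    stacked = np.hstack(streams)                 # (T, sum of stream dims)
    centered = stacked - stacked.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:out_dim].T             # (T, out_dim)


# Hypothetical input: two phone recognizers, 40 PLLR-like values per frame.
rng = np.random.default_rng(0)
streams = [rng.normal(size=(200, 40)) for _ in range(2)]
reduced = stack_and_project(streams, out_dim=20)
sdc_features = shifted_delta(reduced, d=1, P=3, k=7)
```

With these hypothetical settings, each of the 200 frames ends up with 20 × 7 = 140 values, which would then feed the i-vector system mentioned in the abstract.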

Original language: English
Pages (from-to): 3042-3046
Number of pages: 5
Publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Status: Published - 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 Sep 2014 to 18 Sep 2014

Bibliographical note

Publisher Copyright:
Copyright © 2014 ISCA.
