Extended phone log-likelihood ratio features and acoustic-based i-vectors for language recognition

L. F. D'Haro, R. Cordoba, C. Salamea, J. D. Echeverry

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

32 Citations (Scopus)

Abstract

This paper presents new techniques that notably improve the primary system our group submitted to the Albayzin 2012 LRE competition, where the use of any additional corpora for training or optimizing the models was forbidden. We incorporate an additional phonotactic subsystem based on phone log-likelihood ratio (PLLR) features extracted from different phonotactic recognizers, which improves the accuracy of the system by 21.4% in terms of Cavg (we also report results for Fact, the official metric of the evaluation). We show that using these features at the phone-state level provides significant improvements when combined with dimensionality reduction techniques, especially PCA. Applying alternative SDC-like configurations to these PLLR features yields additional improvements. We also describe some modifications to the MFCC-based acoustic i-vector system that contribute further improvements. The final fused system outperformed the baseline by 27.4% in Cavg.
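
As an illustration of the two feature-level ideas named in the abstract, the sketch below converts one frame of phone posteriors into log-likelihood ratios and then stacks shifted deltas over time. The abstract does not give formulas, so everything here is an assumption: the logit form of the ratio, the function names `pllr` and `shifted_deltas`, the parameters `d`, `P`, `k`, their default values, and the edge-clamping behaviour are all hypothetical and not taken from the paper.

```python
import math


def pllr(posteriors, eps=1e-10):
    """Map one frame of phone posteriors to log-likelihood ratios.

    Assumption: the ratio is the logit log(p / (1 - p)); the paper's
    exact normalisation is not stated in the abstract.
    """
    out = []
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        out.append(math.log(p / (1.0 - p)))
    return out


def shifted_deltas(frames, d=1, P=3, k=7):
    """Stack k delta blocks per frame (an SDC-like layout).

    Block i for frame t is frames[t + i*P + d] - frames[t + i*P - d].
    Assumption: out-of-range indices are clamped to the first or last frame.
    """
    T = len(frames)

    def at(t):
        return frames[min(max(t, 0), T - 1)]

    stacked = []
    for t in range(T):
        vec = []
        for i in range(k):
            ahead = at(t + i * P + d)
            behind = at(t + i * P - d)
            vec.extend(a - b for a, b in zip(ahead, behind))
        stacked.append(vec)
    return stacked
```

Each output vector of `shifted_deltas` has `k` times the input dimension, which is one reason a dimensionality reduction step such as PCA would typically be applied before stacking.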

Original language: English
Title of host publication: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5342-5346
Number of pages: 5
ISBN (print): 9781479928927
DOI
Publication status: Published - 2014
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: 4 May 2014 to 9 May 2014

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (print): 1520-6149

Conference

Conference: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/Territory: Italy
City: Florence
Period: 4/05/14 to 9/05/14
