Abstract
This paper explores an improved way to learn word vector representations for language identification (LID). We focus on a phonotactic approach in which phoneme sequences are grouped into phonotactic units (phone-grams) that incorporate context information. To account for the morphology of phone-grams, we use sub-word information (lower-order n-grams) to learn phone-gram embeddings with FastText. These embeddings are used as input to an i-Vector framework to train a multiclass logistic classifier. We compare our approach with an LID system that uses phone-gram embeddings learned with Skipgram, which does not use sub-word information, with Cavg as the evaluation metric. Incorporating sub-word information into phone-gram embeddings significantly improves on the results obtained with embeddings learned without regard to the structure of phone-grams. Furthermore, we show that our system provides complementary information to an acoustic system and improves it when the two systems are fused.
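To illustrate the two ideas the abstract relies on, phone-grams and their lower-order sub-units, here is a minimal Python sketch. It is not the authors' implementation. The function names `phone_grams` and `sub_grams`, the `_` separator, and the example phoneme sequence are hypothetical. It also assumes that sub-units are taken at the phoneme level, whereas FastText itself extracts character n-grams from each token; how the paper maps phonemes onto FastText tokens is not stated in the abstract.

```python
from typing import List


def phone_grams(phonemes: List[str], n: int, sep: str = "_") -> List[str]:
    """Slide a window of n phonemes over the sequence and join each
    window into a single phonotactic unit (hypothetical helper)."""
    return [sep.join(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def sub_grams(phone_gram: str, min_n: int, max_n: int, sep: str = "_") -> List[str]:
    """List the lower-order n-grams contained in one phone-gram.

    Assumption: sub-units are counted in phonemes, not characters.
    """
    phones = phone_gram.split(sep)
    units: List[str] = []
    for k in range(min_n, min(max_n, len(phones)) + 1):
        units.extend(sep.join(phones[i:i + k]) for i in range(len(phones) - k + 1))
    return units


# Illustrative phoneme sequence (made up for this sketch).
sequence = ["k", "a", "s", "a"]
trigrams = phone_grams(sequence, n=3)
parts = sub_grams(trigrams[0], min_n=1, max_n=2)
```

In this sketch each phone-gram shares its sub-units with other phone-grams that contain the same phonemes, which is the kind of structure a sub-word model can exploit and a plain Skipgram model, treating each phone-gram as an atomic token, cannot.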
| Original language | English |
|---|---|
| Title of host publication | Conversational Dialogue Systems for the Next Decade, IWSDS 2020 |
| Editors | Luis Fernando D’Haro, Zoraida Callejas, Satoshi Nakamura |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 339-348 |
| Number of pages | 10 |
| ISBN (Print) | 9789811583940 |
| State | Published - 2021 |
| Event | 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020 - Madrid, Spain; Duration: 21 Sep 2020 → 23 Sep 2020 |
Publication series
| Name | Lecture Notes in Electrical Engineering |
|---|---|
| Volume | 704 |
Conference
| Conference | 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020 |
|---|---|
| Country/Territory | Spain |
| City | Madrid |
| Period | 21/09/20 → 23/09/20 |
Bibliographical note
Publisher Copyright: © 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
CACES Knowledge Areas
- 316A Software and Applications Development and Analysis
Projects
- 1 Finished
- Development and Evaluation of an Intelligent Support System Based on Signal Processing Algorithms for the Evaluation of Patients with Vitiligo
Calle Ortiz, E. R. (PI), Chica Ortiz, J. F. (Assistant), Salamea Palacios, C. R. (Col), Arias Salcedo, K. A. (Student), Auquilla Vicuña, J. F. (Student), Mora Alvarez, J. C. (Student), Zumba Narvaez, F. P. (Student) & Zumba Narvaez, E. A. (Student)
15/06/17 → 22/11/22
Project: Research and Development