On the Use of Phonotactic Vector Representations with FastText for Language Identification

David Romero; Christian Salamea

doi:10.1007/978-981-15-8395-7_25

Research Group on Interaction, Robotics and Automatics (GIIRA)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper explores a better way to learn word vector representations for language identification (LID). We focus on a phonotactic approach in which phoneme sequences are grouped into phonotactic units (phone-grams) that incorporate context information. To take the morphology of phone-grams into account, we use sub-word information (lower-order n-grams) to learn phone-gram embeddings with FastText. These embeddings are used as input to an i-Vector framework to train a multiclass logistic classifier. We compare our approach with an LID system whose phone-gram embeddings are learned with Skip-gram, which does not use sub-word information, using Cavg as the evaluation metric. Incorporating sub-word information into phone-gram embeddings significantly improves on the results obtained with embeddings learned without regard to the structure of phone-grams. Furthermore, we show that our system provides information complementary to an acoustic system and improves it when the two systems are fused.
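
As an illustration of the units the abstract describes, the following minimal Python sketch builds phone-grams from a phoneme sequence and lists the lower-order n-grams contained in each one. It is not the authors' implementation: the function names `phone_grams` and `sub_units`, the `_` separator, the window order of 3, and the toy phoneme sequence are all assumptions made for this example. FastText itself extracts character n-grams from each token; the sketch works at the phoneme level only to keep the idea readable.

```python
from typing import List


def phone_grams(phonemes: List[str], order: int = 3, sep: str = "_") -> List[str]:
    """Slide a window of `order` phonemes over the sequence and join each window.

    Hypothetical helper: the window size and separator are illustrative choices.
    """
    return [sep.join(phonemes[i:i + order])
            for i in range(len(phonemes) - order + 1)]


def sub_units(phone_gram: str, min_n: int = 1, max_n: int = 2, sep: str = "_") -> List[str]:
    """List the lower-order phoneme n-grams contained in a single phone-gram.

    Hypothetical helper: mimics the idea of sub-word units at the phoneme level.
    """
    phones = phone_gram.split(sep)
    units = []
    for n in range(min_n, min(max_n, len(phones)) + 1):
        for i in range(len(phones) - n + 1):
            units.append(sep.join(phones[i:i + n]))
    return units


# Toy phoneme sequence (assumed, not taken from the paper).
seq = ["h", "e", "l", "ou"]
grams = phone_grams(seq, order=3)
print(grams)                # ['h_e_l', 'e_l_ou']
print(sub_units(grams[0]))  # ['h', 'e', 'l', 'h_e', 'e_l']
```

In a sub-word model of this kind, the vector of a phone-gram is built from the vectors of its sub-units, so phone-grams that share lower-order n-grams also share part of their representation.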

Original language	English
Title of host publication	Conversational Dialogue Systems for the Next Decade, IWSDS 2020
Editors	Luis Fernando D’Haro, Zoraida Callejas, Satoshi Nakamura
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	339-348
Number of pages	10
ISBN (Print)	9789811583940
DOIs	https://doi.org/10.1007/978-981-15-8395-7_25
State	Published - 2021
Event	11th International Workshop on Spoken Dialogue Systems, IWSDS 2020 - Madrid, Spain Duration: 21 Sep 2020 → 23 Sep 2020

Publication series

Name	Lecture Notes in Electrical Engineering
Volume	704
ISSN (Print)	1876-1100
ISSN (Electronic)	1876-1119

Conference

Conference	11th International Workshop on Spoken Dialogue Systems, IWSDS 2020
Country/Territory	Spain
City	Madrid
Period	21/09/20 → 23/09/20

Bibliographical note

Publisher Copyright:
© 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.

Cite this

Romero, D., & Salamea, C. (2021). On the Use of Phonotactic Vector Representations with FastText for Language Identification. In L. F. D’Haro, Z. Callejas, & S. Nakamura (Eds.), Conversational Dialogue Systems for the Next Decade, IWSDS 2020 (pp. 339-348). (Lecture Notes in Electrical Engineering; Vol. 704). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-8395-7_25

Romero, David ; Salamea, Christian. / On the Use of Phonotactic Vector Representations with FastText for Language Identification. Conversational Dialogue Systems for the Next Decade, IWSDS 2020. editor / Luis Fernando D’Haro ; Zoraida Callejas ; Satoshi Nakamura. Springer Science and Business Media Deutschland GmbH, 2021. pp. 339-348 (Lecture Notes in Electrical Engineering).

@inproceedings{b12015bc18d043a89260768c2de0e65c,

title = "On the Use of Phonotactic Vector Representations with FastText for Language Identification",

abstract = "This paper explores a better way to learn word vector representations for language identification (LID). We have focused on a phonotactic approach using phoneme sequences in order to make phonotactic units (phone-grams) to incorporate context information. In order to take into consideration the morphology of phone-grams, we have considered the use of sub-word information (lower-order n-grams) to learn phone-grams embeddings using FastText. These embeddings are used as input to an i-Vector framework to train a multiclass logistic classifier. Our approach has been compared with a LID system that uses phone-gram embeddings learned through Skipgram that do not implement sub-word information, using Cavg as a metric for our experiments. Our approach to LID to incorporate sub-word information in phone-grams embeddings significantly improves the results obtained by using embeddings that are learned ignoring the structure of phone-grams. Furthermore, we have shown that our system provides complementary information to an acoustic system, improving it through the fusion of both systems.",

author = "David Romero and Christian Salamea",

note = "Publisher Copyright: {\textcopyright} 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.; 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020 ; Conference date: 21-09-2020 Through 23-09-2020",

year = "2021",

doi = "10.1007/978-981-15-8395-7_25",

language = "English",

isbn = "9789811583940",

series = "Lecture Notes in Electrical Engineering",

volume = "704",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "339--348",

editor = "D{\textquoteright}Haro, {Luis Fernando} and Zoraida Callejas and Satoshi Nakamura",

booktitle = "Conversational Dialogue Systems for the Next Decade, IWSDS 2020",

address = "Germany",

}

Romero, D & Salamea, C 2021, On the Use of Phonotactic Vector Representations with FastText for Language Identification. in LF D’Haro, Z Callejas & S Nakamura (eds), Conversational Dialogue Systems for the Next Decade, IWSDS 2020. Lecture Notes in Electrical Engineering, vol. 704, Springer Science and Business Media Deutschland GmbH, pp. 339-348, 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020, Madrid, Spain, 21/09/20. https://doi.org/10.1007/978-981-15-8395-7_25

On the Use of Phonotactic Vector Representations with FastText for Language Identification. / Romero, David; Salamea, Christian.
Conversational Dialogue Systems for the Next Decade, IWSDS 2020. ed. / Luis Fernando D’Haro; Zoraida Callejas; Satoshi Nakamura. Springer Science and Business Media Deutschland GmbH, 2021. p. 339-348 (Lecture Notes in Electrical Engineering; Vol. 704).

TY - GEN

T1 - On the Use of Phonotactic Vector Representations with FastText for Language Identification

AU - Romero, David

AU - Salamea, Christian

N1 - Publisher Copyright: © 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.

PY - 2021

Y1 - 2021

AB - This paper explores a better way to learn word vector representations for language identification (LID). We have focused on a phonotactic approach using phoneme sequences in order to make phonotactic units (phone-grams) to incorporate context information. In order to take into consideration the morphology of phone-grams, we have considered the use of sub-word information (lower-order n-grams) to learn phone-grams embeddings using FastText. These embeddings are used as input to an i-Vector framework to train a multiclass logistic classifier. Our approach has been compared with a LID system that uses phone-gram embeddings learned through Skipgram that do not implement sub-word information, using Cavg as a metric for our experiments. Our approach to LID to incorporate sub-word information in phone-grams embeddings significantly improves the results obtained by using embeddings that are learned ignoring the structure of phone-grams. Furthermore, we have shown that our system provides complementary information to an acoustic system, improving it through the fusion of both systems.

UR - http://www.scopus.com/inward/record.url?scp=85096424819&partnerID=8YFLogxK

U2 - 10.1007/978-981-15-8395-7_25

DO - 10.1007/978-981-15-8395-7_25

M3 - Conference contribution

AN - SCOPUS:85096424819

SN - 9789811583940

T3 - Lecture Notes in Electrical Engineering

VL - 704

SP - 339

EP - 348

BT - Conversational Dialogue Systems for the Next Decade, IWSDS 2020

A2 - D’Haro, Luis Fernando

A2 - Callejas, Zoraida

A2 - Nakamura, Satoshi

PB - Springer Science and Business Media Deutschland GmbH

T2 - 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020

Y2 - 21 September 2020 through 23 September 2020

ER -

Romero D, Salamea C. On the Use of Phonotactic Vector Representations with FastText for Language Identification. In D’Haro LF, Callejas Z, Nakamura S, editors, Conversational Dialogue Systems for the Next Decade, IWSDS 2020. Springer Science and Business Media Deutschland GmbH. 2021. p. 339-348. (Lecture Notes in Electrical Engineering). doi: 10.1007/978-981-15-8395-7_25
