On the Use of Phonotactic Vector Representations with FastText for Language Identification

David Romero, Christian Salamea

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper explores an improved way to learn word vector representations for language identification (LID). We focus on a phonotactic approach in which phoneme sequences are grouped into phonotactic units (phone-grams) that incorporate context information. To account for the morphology of phone-grams, we use sub-word information (lower-order n-grams) to learn phone-gram embeddings with FastText. These embeddings are the input to an i-Vector framework that is used to train a multiclass logistic classifier. We compare our approach with an LID system whose phone-gram embeddings are learned with Skipgram and therefore carry no sub-word information, using Cavg as the evaluation metric. Incorporating sub-word information into phone-gram embeddings significantly improves on the results obtained with embeddings learned without regard to the structure of phone-grams. Furthermore, we show that our system provides information complementary to an acoustic system, and that fusing the two systems improves on the acoustic system alone.
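As a rough illustration of the two ideas in the abstract, the sketch below builds phone-grams from a phoneme sequence and then lists the lower-order n-grams inside one phone-gram. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `_` separator, the toy phoneme labels, and the n-gram orders are all hypothetical. FastText itself works on character n-grams of each token and represents a token by combining the vectors of those n-grams; the phoneme-level split here is a simplification chosen for readability.

```python
from typing import List


def make_phone_grams(phonemes: List[str], order: int = 3, sep: str = "_") -> List[str]:
    """Join each run of `order` consecutive phonemes into one phone-gram token.

    The separator and the default order are illustrative assumptions.
    """
    return [
        sep.join(phonemes[i:i + order])
        for i in range(len(phonemes) - order + 1)
    ]


def subword_units(phone_gram: str, min_n: int = 1, max_n: int = 2, sep: str = "_") -> List[str]:
    """List the lower-order phoneme n-grams contained in one phone-gram.

    This mimics, at the phoneme level, the sub-word units that a
    FastText-style model would share between related tokens.
    """
    phones = phone_gram.split(sep)
    units = []
    for n in range(min_n, max_n + 1):
        for i in range(len(phones) - n + 1):
            units.append(sep.join(phones[i:i + n]))
    return units


# Toy phoneme sequence (hypothetical labels, not from the paper).
sequence = ["a", "b", "c", "d"]
phone_grams = make_phone_grams(sequence, order=3)
shared = set(subword_units(phone_grams[0])) & set(subword_units(phone_grams[1]))
```

In this toy case the two neighbouring phone-grams share some lower-order units, which is the kind of structure a Skipgram model that treats each phone-gram as an atomic token cannot exploit.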

Original language: English
Title of host publication: Conversational Dialogue Systems for the Next Decade, IWSDS 2020
Editors: Luis Fernando D’Haro, Zoraida Callejas, Satoshi Nakamura
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 339-348
Number of pages: 10
ISBN (Print): 9789811583940
State: Published - 2021
Event: 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020 - Madrid, Spain
Duration: 21 Sep 2020 to 23 Sep 2020

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 704
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 11th International Workshop on Spoken Dialogue Systems, IWSDS 2020
Country/Territory: Spain
City: Madrid
Period: 21/09/20 to 23/09/20

Bibliographical note

Publisher Copyright:
© 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
