Uso de técnicas basadas en one-shot learning para la identificación del locutor

Translated title of the contribution: Speaker identification using techniques based on one-shot learning

Juan Chica, Christian Salamea

Research output: Contribution to journal › Article › peer-review


To be effective, a speaker identification system requires a large number of audio samples from each speaker, which are not always accessible or easy to collect. In contrast, systems based on meta-learning, such as one-shot learning, use a single sample to differentiate between classes. This work evaluates the potential of applying the meta-learning approach to text-independent speaker identification tasks. In the experiments, mel spectrograms, i-vectors, and resampling (downsampling) are used both to process the audio signal and to obtain a feature vector. This feature vector is the input to a siamese neural network that performs the identification task. The best result was obtained when differentiating between 4 speakers, with an accuracy of 0.9. The results show that one-shot learning approaches have great potential for speaker identification and, because of their versatility, could be very useful in real-world fields such as biometrics or forensics.
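The identification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network architecture, so `embed` below is a hypothetical stand-in (a single linear layer with tanh) for the trained siamese branch, and the feature vectors are made-up toy data rather than mel spectrograms or i-vectors. The sketch only shows the one-shot decision rule: both inputs pass through the same weights, and the query is assigned to the speaker whose single reference sample is closest in embedding space.

```python
import math
import random


def embed(features, weights):
    """Shared branch of the siamese pair (hypothetical stand-in).

    One linear layer followed by tanh. Both the query and the reference
    are passed through the SAME weights, which is what makes it siamese.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query, support, weights):
    """One-shot identification: return the speaker whose single reference
    sample is closest to the query in embedding space."""
    q = embed(query, weights)
    return min(support, key=lambda name: distance(q, embed(support[name], weights)))


# Toy data (assumed, not from the paper): 4 speakers, one reference vector each.
rng = random.Random(0)
dim, emb_dim = 8, 4
weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(emb_dim)]
support = {f"speaker_{i}": [rng.gauss(0, 1) for _ in range(dim)] for i in range(4)}

# A query identical to speaker_2's reference has distance 0 to it.
query = list(support["speaker_2"])
print(identify(query, support, weights))  # speaker_2
```

In a real system the weights would be learned from pairs of same-speaker and different-speaker samples, so that the distance reflects speaker identity rather than a random projection.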

Original language: Spanish
Pages (from-to): 101-108
Number of pages: 8
Journal: Procesamiento de Lenguaje Natural
State: Published - Mar 2020

Bibliographical note

Publisher Copyright:
© 2020 Sociedad Española para el Procesamiento del Lenguaje Natural. All rights reserved.

Copyright 2020 Elsevier B.V., All rights reserved.

