Sequence-to-Sequence Spanish Pre-trained Language Models

  • Vladimir Araujo
  • Maria Mihaela Trusca
  • Rodrigo Tufiño
  • Marie Francine Moens

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

In recent years, significant advancements in pre-trained language models have driven the creation of numerous non-English language variants, with a particular emphasis on encoder-only and decoder-only architectures. While Spanish language models based on BERT and GPT have demonstrated proficiency in natural language understanding and generation, there remains a noticeable scarcity of encoder-decoder models explicitly designed for sequence-to-sequence tasks, which aim to map input sequences to generate output sequences conditionally. This paper breaks new ground by introducing the implementation and evaluation of renowned encoder-decoder architectures exclusively pre-trained on Spanish corpora. Specifically, we present Spanish versions of BART, T5, and BERT2BERT-style models and subject them to a comprehensive assessment across various sequence-to-sequence tasks, including summarization, question answering, split-and-rephrase, dialogue, and translation. Our findings underscore the competitive performance of all models, with the BART- and T5-based models emerging as top performers across all tasks. We have made all models publicly available to the research community to foster future explorations and advancements in Spanish NLP: https://github.com/vgaraujov/Seq2Seq-Spanish-PLMs.

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 14729-14743
Number of pages: 15
ISBN (electronic): 9782493814104
Status: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24

Bibliographical note

Publisher Copyright:
© 2024 ELRA Language Resource Association: CC BY-NC 4.0.

CACES Knowledge Areas

  • 116A Computing
