Abstract
In recent years, significant advancements in pre-trained language models have driven the creation of numerous non-English language variants, with a particular emphasis on encoder-only and decoder-only architectures. While Spanish language models based on BERT and GPT have demonstrated proficiency in natural language understanding and generation, there remains a noticeable scarcity of encoder-decoder models explicitly designed for sequence-to-sequence tasks, which aim to map input sequences to generate output sequences conditionally. This paper breaks new ground by introducing the implementation and evaluation of renowned encoder-decoder architectures exclusively pre-trained on Spanish corpora. Specifically, we present Spanish versions of BART, T5, and BERT2BERT-style models and subject them to a comprehensive assessment across various sequence-to-sequence tasks, including summarization, question answering, split-and-rephrase, dialogue, and translation. Our findings underscore the competitive performance of all models, with the BART- and T5-based models emerging as top performers across all tasks. We have made all models publicly available to the research community to foster future explorations and advancements in Spanish NLP: https://github.com/vgaraujov/Seq2Seq-Spanish-PLMs.
| Original language | English |
|---|---|
| Title of host publication | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings |
| Editors | Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue |
| Publisher | European Language Resources Association (ELRA) |
| Pages | 14729-14743 |
| Number of pages | 15 |
| ISBN (Electronic) | 9782493814104 |
| State | Published - 2024 |
| Event | Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy Duration: 20 May 2024 → 25 May 2024 |
Publication series
| Name | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings |
|---|
Conference
| Conference | Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 |
|---|---|
| Country/Territory | Italy |
| City | Hybrid, Torino |
| Period | 20/05/24 → 25/05/24 |
Bibliographical note
Publisher Copyright:© 2024 ELRA Language Resource Association: CC BY-NC 4.0.
Keywords
- generative models
- pre-trained language models
- sequence-to-sequence models
- transformer
CACES Knowledge Areas
- 116A Computer Science
Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver