
Evaluating Word Embedding Models in Ecuadorian Legal Texts: A Comparison of CBOW and Skip-Gram for Semantic Analysis

Research output: Chapter in book/report/conference proceeding, Conference contribution, peer-reviewed

Abstract

This study evaluates the effectiveness of the Continuous Bag-of-Words (CBOW) and Skip-gram models in capturing semantic relationships within Ecuadorian legal texts. Utilizing a comprehensive corpus that includes the Ecuadorian Constitution, the Comprehensive Organic Criminal Code (COIP), and the General Organic Code of Processes (COGEP), among other national laws, we analyze the models’ ability to represent the complex semantics of legal language. The CBOW model predicts target words based on their surrounding context, while Skip-gram predicts the context from a given target word, making them suitable for identifying intricate patterns in legal documents. A rigorous preprocessing phase was applied to the legal texts, including normalization, stopword removal, and lemmatization, ensuring high-quality input data for training. The models were then evaluated using semantic similarity (Spearman’s correlation) and topic coherence metrics. Results indicate that while both models show potential in capturing semantic relationships, CBOW demonstrated a marginally higher performance with a Spearman correlation of 0.24 and a topic coherence score of 0.6637, compared to Skip-gram’s 0.19 and 0.6573, respectively. Despite these findings, neither model fully captured the complexities inherent in legal language, suggesting a need for further refinement in NLP techniques for legal texts. These findings provide a foundation for improving semantic search and information retrieval systems tailored to the legal domain, offering tools to assist legal professionals in analyzing and understanding complex legal texts.
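The difference between the two training objectives described in the abstract can be illustrated with a minimal sketch. It only shows how training examples would be formed for each model, not how the authors trained theirs. The function names, the window size, and the sample tokens are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context_words, target_word) examples.

    CBOW predicts the target word from the words surrounding it.
    """
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip the degenerate single-token case
            examples.append((context, target))
    return examples


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (target_word, context_word) examples.

    Skip-gram predicts each surrounding word from the target word.
    """
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        for context_word in left + right:
            examples.append((target, context_word))
    return examples


# Hypothetical sentence, assumed to be already normalized, stripped of
# stopwords, and lemmatized (the preprocessing steps named in the abstract).
tokens = ["juez", "dictar", "sentencia", "proceso", "penal"]

cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)

print(cbow[1])       # (['juez', 'sentencia'], 'dictar')
print(skipgram[:3])  # [('juez', 'dictar'), ('dictar', 'juez'), ('dictar', 'sentencia')]
```

With the same window, CBOW yields one example per token (many context words, one target), while Skip-gram yields one example per target-context pair, so it produces more, finer-grained training examples from the same text.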

Original language: English
Title of host publication: Smart Technologies, Systems and Applications - 4th International Conference, SmartTech-IC 2024, Revised Selected Papers
Editors: Fabián R. Narváez, Micaela N. Villa, Gloria M. Díaz
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 206-216
Number of pages: 11
ISBN (print): 9783031982866
DOI
Status: Published - 2026
Event: 4th International Conference on Smart Technologies, Systems and Applications, SmartTech-IC 2024 - Quito, Ecuador
Duration: 2 Dec 2024 to 4 Dec 2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2392 CCIS
ISSN (print): 1865-0929
ISSN (electronic): 1865-0937

Conference

Conference: 4th International Conference on Smart Technologies, Systems and Applications, SmartTech-IC 2024
Country/Territory: Ecuador
City: Quito
Period: 2/12/24 to 4/12/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

UN Sustainable Development Goals

This output contributes to the following Sustainable Development Goals

  1. SDG 16: Peace, justice and strong institutions
