Abstract
This study evaluates how well the Continuous Bag-of-Words (CBOW) and Skip-gram models capture semantic relationships in Ecuadorian legal texts. Using a corpus that includes the Ecuadorian Constitution, the Comprehensive Organic Criminal Code (COIP), and the General Organic Code of Processes (COGEP), among other national laws, we analyze the models’ ability to represent the complex semantics of legal language. CBOW predicts a target word from its surrounding context, while Skip-gram predicts the context from a given target word, which makes both models candidates for identifying intricate patterns in legal documents. The legal texts were preprocessed with normalization, stopword removal, and lemmatization before training. The models were then evaluated on semantic similarity (Spearman’s correlation) and topic coherence. Both models show potential in capturing semantic relationships, with CBOW performing marginally better: a Spearman correlation of 0.24 and a topic coherence score of 0.6637, compared with Skip-gram’s 0.19 and 0.6573, respectively. Neither model fully captured the complexities of legal language, which suggests that NLP techniques for legal texts need further refinement. These results provide a foundation for improving semantic search and information retrieval systems tailored to the legal domain, offering tools to help legal professionals analyze and understand complex legal texts.
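As a rough illustration of the two ideas the abstract relies on, the sketch below shows how CBOW and Skip-gram frame training examples differently over the same token window, and how a Spearman correlation between human and model similarity scores can be computed. It is a minimal standard-library sketch, not the authors' implementation: the function names, the sample tokens, and the similarity scores are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def cbow_examples(tokens, window=2):
    """(context words, target) pairs: CBOW predicts the target from its context."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, target))
    return examples

def skipgram_examples(tokens, window=2):
    """(target, context word) pairs: Skip-gram predicts each context word from the target."""
    return [(target, word)
            for context, target in cbow_examples(tokens, window)
            for word in context]

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical lemmatized tokens from a legal sentence (illustration only).
tokens = ["juez", "dictar", "sentencia", "condenatoria"]
cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)

# Hypothetical human similarity judgments vs. model cosine similarities
# for four word pairs; these are not values from the study.
human_scores = [9.0, 7.5, 3.0, 1.0]
model_scores = [0.8, 0.4, 0.5, 0.1]
rho = spearman(human_scores, model_scores)
```

With a window of 1, the four tokens yield four CBOW examples (one per target) and six Skip-gram pairs (one per target-context combination), which is why Skip-gram sees more, finer-grained training signals from the same text. In practice, a library implementation of both models and of Spearman's correlation would normally be used instead of hand-written code.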
| Original language | English |
|---|---|
| Title of host publication | Smart Technologies, Systems and Applications - 4th International Conference, SmartTech-IC 2024, Revised Selected Papers |
| Editors | Fabián R. Narváez, Micaela N. Villa, Gloria M. Díaz |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 206-216 |
| Number of pages | 11 |
| ISBN (Print) | 9783031982866 |
| State | Published - 2026 |
| Event | 4th International Conference on Smart Technologies, Systems and Applications, SmartTech-IC 2024 - Quito, Ecuador. Duration: 2 Dec 2024 → 4 Dec 2024 |
Publication series
| Name | Communications in Computer and Information Science |
|---|---|
| Volume | 2392 CCIS |
| ISSN (Print) | 1865-0929 |
| ISSN (Electronic) | 1865-0937 |
Conference
| Conference | 4th International Conference on Smart Technologies, Systems and Applications, SmartTech-IC 2024 |
|---|---|
| Country/Territory | Ecuador |
| City | Quito |
| Period | 2/12/24 → 4/12/24 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 16: Peace, Justice and Strong Institutions
Keywords
- AI in Law
- CBOW
- Ecuadorian Law
- Legal Texts
- Natural Language Processing
- Semantic Similarity
- Skip-gram
- Word Embeddings
Fingerprint
Dive into the research topics of 'Evaluating Word Embedding Models in Ecuadorian Legal Texts: A Comparison of CBOW and Skip-Gram for Semantic Analysis'. Together they form a unique fingerprint.
Projects
- 1 Active
- Design and Development of a Voice-to-Ecuadorian Sign Language Translator to Improve Inclusion of the Deaf in Ecuador
  Salamea Palacios, C. R. (PI) & Viñanzaca Figueroa, F. J. (Student)
  26/09/24 → …
  Project: Research and Development