A Content-Based System for Discrimination Detection of Ecuadorian Text

Diego Vallejo-Huanga, Alexis Vallejo, Gustavo Contreras

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

The growth in Internet users and the widespread use of social networks have led to higher rates of discrimination. The lack of technological tools and the inability of social networks to identify cyberbullying have caused psychological harm to minority social groups. This research compiled a glossary of segregative terms in the Ecuadorian context, organized according to four types of discrimination defined by the United Nations. The dictionary of discriminatory terms was built from various blogs, research papers, and texts containing segregative expressions. It enabled the development of a content-based software prototype that uses NLP-derived techniques and similarity metrics to classify the type and degree of discrimination of texts in the Ecuadorian context. The prototype includes a web user interface that analyzes the body of a text and displays its degree of discrimination according to this taxonomy. The prototype's performance was validated through functional tests on a dataset of 40 sentences labeled as discriminatory or non-discriminatory. The pre-trained ChatGPT model served as an external validation method. The experiments showed that the prototype's cosine vector approach reaches an accuracy of 88%, compared with 50% for ChatGPT, when classifying discriminatory text in the Ecuadorian domain.
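As an illustration of the general idea described in the abstract (matching a text against a per-category glossary with a cosine similarity), the following is a minimal sketch and not the authors' implementation. The category names, the placeholder terms in `GLOSSARY`, the bag-of-words representation, and the `threshold` value are all assumptions made for this example; the abstract does not specify any of them.

```python
import math
import re
from collections import Counter

# Hypothetical glossary: category names and terms are neutral placeholders,
# not the dictionary compiled in the paper.
GLOSSARY = {
    "category_a": ["termA1", "termA2", "termA3"],
    "category_b": ["termB1", "termB2"],
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_text(text, glossary=GLOSSARY):
    """Return one cosine score per glossary category for the given text."""
    vec = Counter(tokenize(text))
    return {
        cat: cosine(vec, Counter(term.lower() for term in terms))
        for cat, terms in glossary.items()
    }


def classify(text, threshold=0.2):
    """Pick the best-scoring category, or None if no score reaches the
    (assumed) threshold. Also returns all scores as a rough 'degree'."""
    scores = score_text(text)
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores
    return best, scores


if __name__ == "__main__":
    label, scores = classify("a termA1 and termA2 here")
    print(label, scores)
```

In this sketch the per-category score doubles as a rough "degree" of match, and the threshold decides whether a text is flagged at all; how the prototype in the paper defines the degree is not stated in the abstract.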

Original language: English
Title of host publication: Intelligent Systems and Applications - Proceedings of the 2024 Intelligent Systems Conference IntelliSys Volume 4
Editors: Kohei Arai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 284-299
Number of pages: 16
ISBN (Print): 9783031663352
DOI
Status: Published - 2024
Event: Intelligent Systems Conference, IntelliSys 2024 - Amsterdam, Netherlands
Duration: 5 Sep 2024 to 6 Sep 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1068 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: Intelligent Systems Conference, IntelliSys 2024
Country/Territory: Netherlands
City: Amsterdam
Period: 5/09/24 to 6/09/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
