Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language

Gabriel A. Leon-Paredes; Wilson F. Palomeque-Leon; Pablo L. Gallegos-Segovia; Paul E. Vintimilla-Tapia; Jack F. Bravo-Torres; Liliana I. Barbosa-Santillan; Maria M. Paredes-Pinos

doi:10.1109/CHILECON47746.2019.8987684

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

The constant development of information and communication technologies (ICTs) has changed interpersonal interaction, making it possible to transfer real experiences to a virtualized medium such as the Internet. Although this breaks the space-time barriers of traditional communication and strengthens social relationships, problems related to adverse behaviors may arise. Bullying, defined as an act that threatens a person's holistic well-being, becomes cyberbullying when it is carried out over the Internet, and it can cause anxiety, depression and even suicide attempts. For this reason, it is essential to detect this type of behavior in time. This research deploys a Spanish cyberbullying prevention system (SPC), which relies on Natural Language Processing (NLP) methods and several machine learning techniques (Naive Bayes, Support Vector Machine and Logistic Regression), using Twitter as the source from which the knowledge bases, or corpora, are extracted. Several precision metrics and variable corpus sizes are used for training. The learning results reach a maximum accuracy of 93%, verified through three case studies.
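
The abstract names Naive Bayes as one of the classifiers but does not describe the implementation. As a rough illustration of what such a text classifier does, the following is a minimal sketch of a multinomial Naive Bayes model with add-one smoothing, written in plain Python. It is not the authors' system: the whitespace tokenizer, the function names, the two labels and the four toy sentences are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for illustration only.
TOY_TWEETS = [
    ("eres un idiota nadie te quiere", "bullying"),
    ("callate tonto das asco", "bullying"),
    ("que bonito dia para pasear", "neutral"),
    ("gracias por tu ayuda amigo", "neutral"),
]

model = train_naive_bayes(TOY_TWEETS)
print(predict(model, "eres un tonto"))
print(predict(model, "gracias amigo"))
```

A realistic pipeline would need Spanish-specific preprocessing and a far larger labeled corpus before any accuracy figure would be meaningful.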

Original language	English
Title of host publication	IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728131856
DOIs	https://doi.org/10.1109/CHILECON47746.2019.8987684
State	Published - Nov 2019
Event	2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019 - Valparaiso, Chile. Duration: 13 Nov 2019 → 27 Nov 2019

Publication series

Name	IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019

Conference

Conference	2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019
Country/Territory	Chile
City	Valparaiso
Period	13/11/19 → 27/11/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

Cyberbullying
Expert System
Natural Language Processing
Semantics
Sentiment Analysis
Spanish Language Processing

Access to Document

10.1109/CHILECON47746.2019.8987684

Cite this

Leon-Paredes, G. A., Palomeque-Leon, W. F., Gallegos-Segovia, P. L., Vintimilla-Tapia, P. E., Bravo-Torres, J. F., Barbosa-Santillan, L. I., & Paredes-Pinos, M. M. (2019). Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language. In IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019, Article 8987684 (IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CHILECON47746.2019.8987684

Leon-Paredes, Gabriel A. ; Palomeque-Leon, Wilson F. ; Gallegos-Segovia, Pablo L. et al. / Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language. IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019. Institute of Electrical and Electronics Engineers Inc., 2019. (IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019).

@inproceedings{d2b7482804434b38a025e13ec9a46fc4,

title = "Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language",

abstract = "Nowadays, the constant development of information and communication technologies (ICTs) has changed the inter-personal interaction, allowing to transfer real experiences to a virtualized medium such as Internet. In this sense, although the space-time barriers of traditional communication are broken and social relationships are strengthened, problems related to adverse behaviors may arise. Bullying, defined as an act that threatens a person's holistic well-being, becomes cyberbullying when it is done over Internet, causing anxiety problems, depression and even suicide attempts. For this reason, it is essential to detect this type of behaviour in time. This research deploys a Spanish cyberbullying prevention system (SPC), which relies on Natural Language Processing (NLP) methods and different machine learning techniques (Naive Bayes, Support Vector Machine and Logistic Regression), using Twitter as the basis for the extraction of knowledge bases or corpus. Several precision metrics and variable corpus sizes are used for the training. The learning results reach a maximum accuracy of 93%, verified through the application of three study cases.",

keywords = "Cyberbullying, Expert System, Natural Language Processing, Semantics, Sentiment Analysis, Spanish Language Processing",

author = "Leon-Paredes, {Gabriel A.} and Palomeque-Leon, {Wilson F.} and Gallegos-Segovia, {Pablo L.} and Vintimilla-Tapia, {Paul E.} and Bravo-Torres, {Jack F.} and Barbosa-Santillan, {Liliana I.} and Paredes-Pinos, {Maria M.}",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019 ; Conference date: 13-11-2019 Through 27-11-2019",

year = "2019",

month = nov,

doi = "10.1109/CHILECON47746.2019.8987684",

language = "English",

series = "IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019",

address = "United States",

}

Leon-Paredes, GA, Palomeque-Leon, WF, Gallegos-Segovia, PL, Vintimilla-Tapia, PE, Bravo-Torres, JF, Barbosa-Santillan, LI & Paredes-Pinos, MM 2019, Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language. in IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019, 8987684, IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019, Institute of Electrical and Electronics Engineers Inc., 2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019, Valparaiso, Chile, 13/11/19. https://doi.org/10.1109/CHILECON47746.2019.8987684

Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language. / Leon-Paredes, Gabriel A.; Palomeque-Leon, Wilson F.; Gallegos-Segovia, Pablo L. et al.
IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019. Institute of Electrical and Electronics Engineers Inc., 2019. 8987684 (IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019).

TY - GEN

T1 - Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language

AU - Leon-Paredes, Gabriel A.

AU - Palomeque-Leon, Wilson F.

AU - Gallegos-Segovia, Pablo L.

AU - Vintimilla-Tapia, Paul E.

AU - Bravo-Torres, Jack F.

AU - Barbosa-Santillan, Liliana I.

AU - Paredes-Pinos, Maria M.

PY - 2019/11

Y1 - 2019/11

N2 - Nowadays, the constant development of information and communication technologies (ICTs) has changed the inter-personal interaction, allowing to transfer real experiences to a virtualized medium such as Internet. In this sense, although the space-time barriers of traditional communication are broken and social relationships are strengthened, problems related to adverse behaviors may arise. Bullying, defined as an act that threatens a person's holistic well-being, becomes cyberbullying when it is done over Internet, causing anxiety problems, depression and even suicide attempts. For this reason, it is essential to detect this type of behaviour in time. This research deploys a Spanish cyberbullying prevention system (SPC), which relies on Natural Language Processing (NLP) methods and different machine learning techniques (Naive Bayes, Support Vector Machine and Logistic Regression), using Twitter as the basis for the extraction of knowledge bases or corpus. Several precision metrics and variable corpus sizes are used for the training. The learning results reach a maximum accuracy of 93%, verified through the application of three study cases.

AB - Nowadays, the constant development of information and communication technologies (ICTs) has changed the inter-personal interaction, allowing to transfer real experiences to a virtualized medium such as Internet. In this sense, although the space-time barriers of traditional communication are broken and social relationships are strengthened, problems related to adverse behaviors may arise. Bullying, defined as an act that threatens a person's holistic well-being, becomes cyberbullying when it is done over Internet, causing anxiety problems, depression and even suicide attempts. For this reason, it is essential to detect this type of behaviour in time. This research deploys a Spanish cyberbullying prevention system (SPC), which relies on Natural Language Processing (NLP) methods and different machine learning techniques (Naive Bayes, Support Vector Machine and Logistic Regression), using Twitter as the basis for the extraction of knowledge bases or corpus. Several precision metrics and variable corpus sizes are used for the training. The learning results reach a maximum accuracy of 93%, verified through the application of three study cases.

KW - Cyberbullying

KW - Expert System

KW - Natural Language Processing

KW - Semantics

KW - Sentiment Analysis

KW - Spanish Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85081067265&partnerID=8YFLogxK

U2 - 10.1109/CHILECON47746.2019.8987684

DO - 10.1109/CHILECON47746.2019.8987684

M3 - Conference contribution

AN - SCOPUS:85081067265

T3 - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019

BT - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019

Y2 - 13 November 2019 through 27 November 2019

ER -

Leon-Paredes GA, Palomeque-Leon WF, Gallegos-Segovia PL, Vintimilla-Tapia PE, Bravo-Torres JF, Barbosa-Santillan LI et al. Presumptive Detection of Cyberbullying on Twitter through Natural Language Processing and Machine Learning in the Spanish Language. In IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019. Institute of Electrical and Electronics Engineers Inc. 2019. 8987684. (IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, CHILECON 2019). doi: 10.1109/CHILECON47746.2019.8987684
