Detecting cyberbullying: query terms and techniques

Authors:
April Kontostathis; Kelly Reynolds; Andy Garron; Lynne Edwards
Affiliations:
Ursinus College, Collegeville, PA; Lehigh University, Bethlehem, PA; University of Maryland, College Park, MD; Ursinus College, Collegeville, PA
Venue:
Proceedings of the 5th Annual ACM Web Science Conference
Year:
2013

Abstract

In this paper we present a close analysis of the language used in cyberbullying. Our corpus is a collection of posts from Formspring.me, a social networking site where users can ask questions of other users. The site appeals primarily to teens and young adults, and its cyberbullying content is dense: between 7% and 14% of the posts we analyzed contain cyberbullying content. The results presented in this article are twofold. Our first experiments were designed to develop an understanding of both the specific words used by cyberbullies and the context surrounding those words. We identified the most commonly used cyberbullying terms and developed queries that can be used to detect cyberbullying content. Five of our queries achieve an average precision of 91.25% at rank 100. In our second set of experiments we extended this work with a supervised machine learning approach to detecting cyberbullying. The machine learning experiments identified additional terms that are consistent with cyberbullying content, as well as an additional querying technique that accurately assigned scores to posts from Formspring.me. The posts with the highest scores are shown to have a high density of cyberbullying content.
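As a rough illustration of the "precision at rank 100" figure quoted in the abstract, the sketch below computes precision at rank k for a ranked list of posts and averages it over several queries. It assumes that the reported 91.25% is the mean of the per-query precision values, which the abstract does not state explicitly. The function names and the toy relevance labels are hypothetical and are not taken from the paper or its data.

```python
from typing import Sequence


def precision_at_k(ranked_labels: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked posts that are labeled as cyberbullying.

    ranked_labels[i] is True when the post at rank i+1 contains
    cyberbullying content (hypothetical labels for illustration).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Standard precision at k divides by k, even if fewer posts were returned.
    return sum(ranked_labels[:k]) / k


def mean_precision_at_k(runs: Sequence[Sequence[bool]], k: int) -> float:
    """Mean of precision at k over several queries (assumed averaging scheme)."""
    if not runs:
        raise ValueError("at least one query run is required")
    return sum(precision_at_k(run, k) for run in runs) / len(runs)


# Toy example with k = 4 rather than 100, using made-up labels.
query_a = [True, True, False, True, False]
query_b = [True, True, True, True, False]

p_a = precision_at_k(query_a, 4)                      # 3 of the top 4 are relevant
p_b = precision_at_k(query_b, 4)                      # 4 of the top 4 are relevant
mean_p = mean_precision_at_k([query_a, query_b], 4)   # mean of the two values
```

With k set to 100 and one label list per query, the same two functions would give the kind of per-query and averaged figures the abstract reports, provided the averaging assumption above holds.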