Modelling fixated discourse in chats with cyberpedophiles
EACL 2012 Proceedings of the Workshop on Computational Approaches to Deception Detection
In this paper, we propose a list of high-level features and study their applicability to the detection of cyberpedophiles. We used a corpus of chats downloaded from http://www.perverted-justice.com and two negative datasets of different nature: cybersex logs available online and the NPS chat corpus. The classification results show that the NPS data and the pedophiles' conversations can be accurately discriminated from each other using character n-grams, whereas the harder case of cybersex logs requires high-level features to reach good accuracy. In this latter setting, features that model behaviour and emotion significantly outperform the low-level ones and achieve 97% accuracy.