Emotion detection in suicide notes

Authors:
Bart Desmet;Véronique Hoste
Affiliations:
LT3 Language and Translation Technology Team, University College Ghent, Groot-Brittanniëlaan 45, 9000 Ghent, Belgium and Department of Applied Mathematics and Computer Science, Ghent Universi ...;LT3 Language and Translation Technology Team, University College Ghent, Groot-Brittanniëlaan 45, 9000 Ghent, Belgium and Department of Linguistics, Ghent University, Blandijnberg 2, 9000 Ghen ...
Venue:
Expert Systems with Applications: An International Journal
Year:
2013

Abstract

The success of suicide prevention, a major public health concern worldwide, hinges on adequate suicide risk assessment. Online platforms are increasingly used to express suicidal thoughts, but manual monitoring is infeasible given the volume of information experts are confronted with. We investigate whether recent advances in natural language processing, and more specifically in sentiment mining, can be used to accurately pinpoint 15 different emotions that might be indicative of suicidal behavior. A system for automatic emotion detection was built using binary support vector machine classifiers. We hypothesized that lexical and semantic features would represent the data adequately, as emotions seemed to be lexicalized consistently. The optimal feature combination for each emotion was determined using bootstrap resampling. Spelling correction was applied to the input data to reduce lexical variation. Classification performance varied between emotions, with F-scores of up to 68.86%. F-scores above 40% were achieved for six of the seven most frequent emotions: thankfulness, guilt, love, information, hopelessness and instructions. The most salient features were trigram and lemma bags-of-words and subjectivity clues. Spelling correction had a slightly positive effect on classification performance. We showed that fine-grained automatic emotion detection benefits from classifier optimization and a combined lexico-semantic feature representation. The modest performance improvements obtained through spelling correction might indicate that the system is robust to noisy input text. We conclude that natural language processing techniques have potential for future application in suicide prevention.
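
The abstract does not say exactly how bootstrap resampling was used to pick a feature combination per emotion. One plausible reading, sketched below in Python, is a paired comparison: resample the evaluation sentences with replacement and count how often one feature set yields a higher F-score than another for a single binary emotion label. The function names (`f_score`, `bootstrap_win_rate`), the number of resamples, and the toy labels are all assumptions made for illustration, not details taken from the paper.

```python
import random


def f_score(gold, pred):
    """F1 for one binary emotion label; gold and pred are parallel lists of 0/1."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_win_rate(gold, pred_a, pred_b, n_samples=1000, seed=0):
    """Fraction of resamples in which predictions A get a strictly higher F1 than B.

    Hypothetical helper: pred_a and pred_b stand for the outputs of two
    classifiers trained with different feature combinations.
    """
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    n = len(gold)
    wins = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement, shared by both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gold[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        if f_score(g, a) > f_score(g, b):
            wins += 1
    return wins / n_samples


# Toy data (invented): 1 = sentence carries the emotion, 0 = it does not.
gold = [1, 1, 1, 0, 0, 0, 1, 0]
with_trigrams = [1, 1, 0, 0, 0, 1, 1, 0]   # hypothetical feature set A
without_trigrams = [1, 0, 0, 0, 1, 1, 0, 0]  # hypothetical feature set B
print(bootstrap_win_rate(gold, with_trigrams, without_trigrams))
```

A win rate close to 1 would suggest that feature set A is reliably better for that emotion, while a rate near 0.5 would suggest the difference is not stable across resamples. Repeating the comparison per emotion mirrors the abstract's point that the optimal combination was determined separately for each label.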