An important issue that has been neglected so far is the identification of diversionary comments. Diversionary comments under political blog posts are defined as comments that deliberately twist the bloggers' intention and divert the topic to another one. Their purpose is to distract readers from the original topic and draw attention to a new one. Given that political blogs have a significant impact on society, we believe it is imperative to identify such comments. We categorize diversionary comments into five types and propose an effective technique to rank comments in descending order of how diversionary they are. To the best of our knowledge, the problem of detecting diversionary comments has not been studied before. Our evaluation on 2,109 comments under 20 different blog posts from Digg.com shows that the proposed method achieves a high mean average precision (MAP) of 92.6%. Sensitivity analysis indicates that the effectiveness of the method is stable under different parameter settings.
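To make the reported metric concrete, the following is a minimal sketch of how mean average precision is commonly computed over ranked lists. It assumes one ranked comment list per blog post with a binary label per comment (True meaning diversionary); the abstract does not spell out the exact evaluation protocol, and the function names and toy data here are hypothetical illustrations, not the authors' code or data.

```python
from typing import Dict, List


def average_precision(ranked_labels: List[bool]) -> float:
    """Average precision for one ranked list.

    ranked_labels[i] is True if the comment at rank i + 1 is relevant
    (assumed here to mean "diversionary").
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # A list with no relevant items contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Dict[str, List[bool]]) -> float:
    """Mean of the per-post average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(labels) for labels in rankings.values()) / len(rankings)


# Hypothetical toy data: two posts, four ranked comments each.
toy_rankings = {
    "post_a": [True, True, False, False],
    "post_b": [True, False, True, False],
}

if __name__ == "__main__":
    print(f"MAP = {mean_average_precision(toy_rankings):.4f}")
```

Under this definition, a ranking that places every diversionary comment ahead of every non-diversionary one scores 1.0 for that post, so a MAP close to 1 means diversionary comments are concentrated near the top of the per-post rankings.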