Given a movie comment, does it contain a spoiler? A spoiler is a comment that, when disclosed, would ruin a surprise or reveal an important plot detail. We study automatic methods to detect comments and reviews that contain spoilers and apply them to reviews from the IMDB (Internet Movie Database) website. We develop topic models based on Latent Dirichlet Allocation (LDA) that use linguistic dependency information in place of the simple features of bag-of-words (BOW) representations. Experimental results demonstrate the effectiveness of our technique on four movie-comment datasets of different scales.