User-generated content in general, and blogs in particular, form an interesting and relatively unexplored domain for knowledge mining. We address the task of blog distillation: finding blogs that are principally devoted to a given topic, as opposed to blogs that merely happen to discuss the topic in passing. Working in the setting of statistical language modeling, we model the task by aggregating a blogger's posts to collect evidence of both relevance to the topic and persistence of interest in it. This approach achieves state-of-the-art performance. On top of this baseline, we extend the model with a number of blog-specific features concerning document structure, social structure, and temporal structure. These blog-specific features yield further improvements.
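The aggregation idea can be illustrated with a minimal sketch. It is not the model from the paper, whose exact estimators are not given here. It assumes a query-likelihood score per post with Dirichlet smoothing, a uniform weight for every post within a blog, and add-one smoothing of the collection model; the function names and the default `mu` are hypothetical choices made for illustration. Averaging over all posts means a blog that returns to the topic repeatedly outscores one that mentions it once.

```python
import math
from collections import Counter


def build_collection(blogs):
    """Count term frequencies over every post of every blog (background model)."""
    counts = Counter()
    for posts in blogs.values():
        for post in posts:
            counts.update(post)
    return counts, sum(counts.values())


def post_likelihood(query, post, coll_counts, coll_len, mu=10.0):
    """Dirichlet-smoothed query likelihood P(query | post).

    Assumption: the collection probability is add-one smoothed so that
    unseen query terms never produce log(0).
    """
    tf = Counter(post)
    vocab = len(coll_counts)
    log_p = 0.0
    for term in query:
        p_coll = (coll_counts.get(term, 0) + 1) / (coll_len + vocab)
        log_p += math.log((tf[term] + mu * p_coll) / (len(post) + mu))
    return math.exp(log_p)


def score_blog(query, posts, coll_counts, coll_len, mu=10.0):
    """Aggregate post-level evidence: mean of P(query | post) over the blog's posts.

    Assumption: every post is weighted equally, i.e. P(post | blog) is uniform.
    """
    if not posts:
        return 0.0
    total = sum(post_likelihood(query, p, coll_counts, coll_len, mu) for p in posts)
    return total / len(posts)


def rank_blogs(query, blogs, mu=10.0):
    """Return blog ids sorted by descending aggregated score."""
    coll_counts, coll_len = build_collection(blogs)
    scores = {
        blog_id: score_blog(query, posts, coll_counts, coll_len, mu)
        for blog_id, posts in blogs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Here `blogs` maps a blog identifier to a list of tokenized posts, and `rank_blogs(["python"], blogs)` returns the identifiers ordered by aggregated score. The blog-specific features mentioned above (document, social, and temporal structure) are not modeled in this sketch; one way to add them would be to replace the uniform post weight with a non-uniform one.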