User-generated content forms an important domain for mining knowledge. In this paper, we address the task of blog feed search: finding blogs that are principally devoted to a given topic, as opposed to blogs that merely mention the topic in passing. The large number of blogs makes the blogosphere a challenging domain, both for effectiveness and for storage and retrieval efficiency. We examine the effectiveness of an approach to blog feed search that uses individual posts, rather than full blogs, as indexing units. Working in the setting of a probabilistic language modeling approach to information retrieval, we model the blog feed search task by aggregating over a blogger's posts to collect evidence of relevance to the topic and of persistence of interest in the topic. This approach achieves state-of-the-art performance in terms of effectiveness. We then introduce a two-stage model in which a pre-selection of candidate blogs is followed by a ranking step. The model integrates aggressive pruning techniques as well as very lean representations of the contents of blog posts, resulting in substantial gains in efficiency while maintaining effectiveness at a very competitive level.
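To make the post-based aggregation and the two-stage idea concrete, the following is a minimal sketch and not the model from the paper. It assumes a query-likelihood score per post with Jelinek-Mercer smoothing, a uniform weight for each post within its blog, and a pre-selection step that simply keeps the highest-scoring posts. The names `post_likelihood`, `rank_blogs`, `lam`, and `top_n_posts` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def post_likelihood(query, post_tokens, coll_counts, coll_len, lam=0.5):
    """P(query | post) under a unigram model with Jelinek-Mercer smoothing.

    The smoothing method and lam=0.5 are assumptions made for this sketch.
    """
    tf = Counter(post_tokens)
    n = len(post_tokens)
    p = 1.0
    for term in query:
        p_post = tf[term] / n if n else 0.0
        p_coll = coll_counts[term] / coll_len
        p *= (1 - lam) * p_post + lam * p_coll
    return p


def rank_blogs(query, blogs, top_n_posts=None, lam=0.5):
    """Rank blogs by aggregating post-level evidence.

    blogs maps a blog id to a non-empty list of posts; each post is a
    list of tokens. If top_n_posts is given, only the best-scoring posts
    are kept before aggregation (a crude stand-in for pre-selection).
    """
    # Collection statistics used for smoothing.
    coll_counts = Counter(
        t for posts in blogs.values() for post in posts for t in post
    )
    coll_len = sum(coll_counts.values())

    # Score every post individually: posts are the indexing units.
    scored = [
        (post_likelihood(query, post, coll_counts, coll_len, lam), blog_id)
        for blog_id, posts in blogs.items()
        for post in posts
    ]

    # Stage 1 (optional): prune to the top-N posts; blogs with no
    # surviving post drop out of the candidate set.
    if top_n_posts is not None:
        scored = sorted(scored, key=lambda x: x[0], reverse=True)[:top_n_posts]

    # Stage 2: sum post scores per blog, weighting each post by
    # 1 / (number of posts in the blog), so that a blog needs many
    # on-topic posts, not just one, to score highly.
    totals = defaultdict(float)
    for score, blog_id in scored:
        totals[blog_id] += score / len(blogs[blog_id])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    toy_blogs = {
        "jazz_blog": [["jazz", "sax"], ["jazz", "trumpet"], ["jazz", "piano"]],
        "misc_blog": [["jazz", "food"], ["travel", "food"], ["cats", "dogs"]],
    }
    print(rank_blogs(["jazz"], toy_blogs))
    print(rank_blogs(["jazz"], toy_blogs, top_n_posts=3))
```

In the toy data, both blogs mention the query term, but only one does so in every post. Dividing by the total number of posts in a blog is one simple way to reward persistence of interest over a passing mention; other post weightings are equally plausible.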