Finding Key Bloggers, One Post At A Time

Authors:
Wouter Weerkamp;Krisztian Balog;Maarten de Rijke
Affiliations:
ISLA, University of Amsterdam, The Netherlands. Email: weerkamp,kbalog,mdr@science.uva.nl
Venue:
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Year:
2008

Abstract

User-generated content in general, and blogs in particular, form an interesting and relatively unexplored domain for mining knowledge. We address the task of blog distillation: finding blogs that are principally devoted to a given topic, as opposed to blogs that merely discuss the topic in passing. Working in the setting of statistical language modeling, we model the task by aggregating a blogger's posts to collect evidence of relevance to the topic and of persistent interest in it. This approach achieves state-of-the-art performance. On top of this baseline, we extend our model with a number of blog-specific features concerning document structure, social structure, and temporal structure. These blog-specific features yield further improvements.
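
The abstract does not spell out the model, so the following is only a minimal sketch of what "aggregating a blogger's posts" in a language-modeling setting could look like, not the authors' formulation. It assumes uniform weights over a blog's posts and Jelinek-Mercer smoothing against a collection model; the function names, the `lam` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def post_model(post_tokens, collection_counts, collection_len, lam=0.5):
    """Return a function term -> smoothed P(term | post).

    Assumption: Jelinek-Mercer smoothing with weight `lam` on the
    collection (background) model.
    """
    counts = Counter(post_tokens)
    n = len(post_tokens)

    def prob(term):
        p_ml = counts[term] / n if n else 0.0
        p_bg = collection_counts[term] / collection_len
        return (1 - lam) * p_ml + lam * p_bg

    return prob


def score_blog(query_tokens, posts, collection_counts, collection_len, lam=0.5):
    """Log-likelihood of the query under a blog model that averages post models.

    Assumption: every post contributes equally, i.e. P(post | blog) = 1 / #posts.
    """
    if not posts:
        return float("-inf")
    models = [post_model(p, collection_counts, collection_len, lam) for p in posts]
    weight = 1.0 / len(models)
    log_score = 0.0
    for term in query_tokens:
        p_term = sum(weight * m(term) for m in models)
        if p_term == 0.0:
            return float("-inf")  # term never occurs anywhere in the collection
        log_score += math.log(p_term)
    return log_score


def rank_blogs(query_tokens, blogs, lam=0.5):
    """Rank blog ids (dict: id -> list of tokenized posts) by query likelihood."""
    collection_counts = Counter(
        tok for posts in blogs.values() for post in posts for tok in post
    )
    collection_len = sum(collection_counts.values())
    scores = {
        blog_id: score_blog(query_tokens, posts, collection_counts, collection_len, lam)
        for blog_id, posts in blogs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: one blog returns to the topic in every post, the other mentions it once.
toy_blogs = {
    "devoted": [["python", "code"], ["python", "tips"], ["python", "bugs"]],
    "passing": [["cats", "dogs"], ["python", "once"], ["food", "travel"]],
}
ranking = rank_blogs(["python"], toy_blogs)
```

Because the blog score averages over posts, a blog that mentions the query term in every post outranks one that mentions it in a single post, which is one simple way to reward persistent interest over a passing mention. The blog-specific features named in the abstract (document, social, and temporal structure) are not modeled here.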