Employing document dependency in blog search

Authors:
Mostafa Keikha; Fabio Crestani; Mark James Carman
Affiliations:
Faculty of Informatics, University of Lugano, Switzerland; Faculty of Informatics, University of Lugano, Switzerland; Faculty of IT, Monash University, Australia
Venue:
Journal of the American Society for Information Science and Technology
Year:
2012

Abstract

The goal in blog search is to rank blogs according to their recurrent relevance to the topic of the query. State-of-the-art approaches treat this as an expert search or resource selection problem. We investigate how content-based similarity between posts affects retrieval performance. We test two approaches for smoothing (regularizing) the relevance scores of posts based on their dependencies. In the first, we smooth the term distributions describing posts by performing a random walk over a document-term graph in which similar posts are highly connected. In the second, we directly smooth the scores of posts using a regularization framework that aims to minimize the discrepancy between the scores of similar documents. We then extend both approaches to take the time interval between posts into account when smoothing, on the assumption that temporally close posts are good sources for smoothing each other's relevance scores. We compare these methods with state-of-the-art blog search approaches that employ language modeling-based resource selection algorithms and fusion-based methods for aggregating post relevance scores. We show performance gains over baseline techniques that do not exploit the relations between posts when smoothing relevance estimates. © 2012 Wiley Periodicals, Inc.
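
The second approach in the abstract, score regularization with a temporal extension, can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give. It assumes a simple quadratic objective, Q(f) = sum_i (f_i - y_i)^2 + mu * sum_{i<j} w_ij (f_i - f_j)^2, where y holds the initial post scores and w_ij is a symmetric post-to-post similarity. The exponential time decay, the function names, and the parameters `mu` and `decay` are all hypothetical choices made for illustration.

```python
import math


def temporal_weight(sim, t_i, t_j, decay):
    """Shrink a content similarity by the time gap between two posts.

    Exponential decay is an assumption made for this sketch only.
    """
    return sim * math.exp(-decay * abs(t_i - t_j))


def smooth_scores(scores, sims, times=None, mu=1.0, decay=0.1, iters=200):
    """Smooth post relevance scores over a similarity graph.

    Minimises  sum_i (f_i - y_i)^2 + mu * sum_{i<j} w_ij (f_i - f_j)^2
    with Jacobi iterations. `sims` must be a symmetric matrix of
    non-negative similarities; the diagonal is ignored.
    """
    n = len(scores)

    # Edge weights: content similarity, optionally damped by time distance.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            s = sims[i][j]
            if times is not None:
                s = temporal_weight(s, times[i], times[j], decay)
            w[i][j] = s

    degree = [sum(row) for row in w]

    # Setting the gradient of the objective to zero gives
    #   f_i * (1 + mu * d_i) = y_i + mu * sum_j w_ij * f_j,
    # which is iterated here starting from the initial scores.
    f = list(scores)
    for _ in range(iters):
        f = [
            (scores[i] + mu * sum(w[i][j] * f[j] for j in range(n)))
            / (1.0 + mu * degree[i])
            for i in range(n)
        ]
    return f
```

In this sketch a post with no similar neighbours keeps its original score, while a low-scoring post that closely resembles a high-scoring one is pulled upward. Passing `times` weakens the pull between posts written far apart, which mirrors the temporal intuition stated in the abstract.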