Webpage Importance Analysis Using Conditional Markov Random Walk

Authors:
Tie-Yan Liu;Wei-Ying Ma
Affiliations:
Microsoft Research Asia;Microsoft Research Asia
Venue:
WI '05 Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
Year:
2005

Citing 0
Cited 5

Generalizing PageRank: damping functions for link-based ranking algorithms
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval

Multi-document summarization using cluster-based link analysis
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

An exploration of document impact on graph-based multi-document summarization
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Answering opinion questions with random walks on graphs
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2

Enhanced Information Retrieval by Exploiting Recommender Techniques in Cluster-Based Link Analysis
Proceedings of the 2013 Conference on the Theory of Information Retrieval

Quantified Score

Hi-index	0.02

Abstract

In this paper, we propose a novel method to calculate webpage importance based on a conditional Markov random walk model. The main assumption in this model is that, given the hyperlinks in a webpage, users do not click one of them uniformly at random. Instead, many factors may bias their behavior, for example the anchor text, the content relevance, and their previous experience with the website that a destination page belongs to. As a result, users may tend to visit pages in high-quality websites with higher probability. To implement this idea, we reformulate the Web graph as a two-layer structure, and webpage importance is calculated by a conditional random walk in this new Web graph. Experiments on the topic distillation task of the TREC 2003 Web track showed that our new method can achieve about 18% improvement in mean average precision (MAP) and 16% in precision at 10 (P@10) over the PageRank algorithm.
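
The abstract does not give the model's equations, so the following Python is only a minimal sketch of one way the two-layer idea could be read, not the authors' formulation. It assumes an upper layer of websites scored by a damped random walk over cross-site links, and a lower layer of pages where each outgoing link is weighted by the score of the destination page's website instead of being chosen uniformly. The names `power_iteration`, `conditional_walk`, `links`, and `site_of`, the damping value 0.85, and the choice of site score as the bias are all assumptions made for illustration.

```python
from collections import defaultdict


def power_iteration(weights, nodes, damping=0.85, iters=100):
    """Stationary scores of a damped random walk over `nodes`.

    weights[u][v] is the unnormalized preference for stepping u -> v;
    each row is normalized on the fly.
    """
    n = len(nodes)
    score = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Teleportation mass shared equally by all nodes.
        nxt = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = weights.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its mass uniformly.
                for v in nodes:
                    nxt[v] += damping * score[u] / n
            else:
                for v, w in out.items():
                    nxt[v] += damping * score[u] * w / total
        score = nxt
    return score


def conditional_walk(links, site_of, damping=0.85):
    """Two-layer walk (assumed reading of the abstract).

    links   : list of (source_page, destination_page) pairs
    site_of : dict mapping every page to its website
    Returns (page_score, site_score).
    """
    pages = sorted(site_of)
    sites = sorted(set(site_of.values()))

    # Upper layer: websites connected by counts of cross-site links.
    site_w = defaultdict(lambda: defaultdict(float))
    for u, v in links:
        if site_of[u] != site_of[v]:
            site_w[site_of[u]][site_of[v]] += 1.0
    site_score = power_iteration(site_w, sites, damping)

    # Lower layer: a click is biased toward pages on higher-scoring sites.
    page_w = defaultdict(lambda: defaultdict(float))
    for u, v in links:
        page_w[u][v] += site_score[site_of[v]]
    page_score = power_iteration(page_w, pages, damping)
    return page_score, site_score
```

With uniform link weights this reduces to ordinary PageRank, so the only change is the bias term. For example, if one page links to two pages that have no other in-links, the page whose website receives more cross-site links ends up with the higher score under this sketch, whereas PageRank would score them equally.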