MFCRank: a web ranking algorithm based on correlation of multiple features

Authors:
Yunming Ye;Yan Li;Xiaofei Xu;Joshua Huang;Xiaojun Chen
Affiliations:
Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, China;Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, China;Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, China;E-Business Technology Institute, The University of Hong Kong, Hong Kong;Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, China
Venue:
CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2006

Abstract

This paper presents MFCRank, a new ranking algorithm for topic-specific Web search systems. The basic idea is to correlate two types of similarity information within a unified link analysis model, so that the rich content and link features of Web collections can be exploited efficiently to improve ranking performance. First, a new surfer model, JBC, is proposed, under which the topic similarity among neighboring pages is used to weight the surfer's jumping probability and to direct the surfing activity. Second, because the JBC surfer model is still query-independent, it must also be correlated with the query. This is done through the MFCRank score, defined as a linear combination of the JBC score and the similarity between the query and each matched page. Through these two correlation steps, the features contained in the plain text, link structure, anchor text, and user query are combined in a single ranking model. Ranking experiments were carried out on a set of topic-specific Web page collections, and the results showed that the algorithm substantially improved ranking precision.
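
The two correlation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the PageRank-style iteration, the damping factor, the uniform fallback for pages without weighted out-links, the mixing weight `alpha`, and the function names `jbc_scores` and `mfcrank_scores` are all assumptions made for illustration. The sketch also does not rescale the two terms before combining them, which a real system would probably need to do.

```python
def jbc_scores(links, sim, damping=0.85, iters=50):
    """Similarity-weighted random-surfer scores (illustrative stand-in for JBC).

    links: dict mapping a page to the list of pages it links to.
    sim:   dict mapping (source, target) to a topic similarity in [0, 1].
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Assumed random-jump term, as in standard PageRank.
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            weights = [sim.get((p, q), 0.0) for q in outs]
            total = sum(weights)
            if total == 0.0:
                # No usable out-links: spread this page's mass uniformly (assumption).
                for q in pages:
                    nxt[q] += damping * score[p] / n
            else:
                # Follow each link with probability proportional to topic similarity.
                for q, w in zip(outs, weights):
                    nxt[q] += damping * score[p] * w / total
        score = nxt
    return score


def mfcrank_scores(jbc, query_sim, alpha=0.5):
    """Linear combination of the link-based score and query similarity.

    query_sim: dict mapping each matched page to its similarity with the query.
    alpha:     hypothetical mixing weight, not taken from the paper.
    """
    return {p: alpha * jbc[p] + (1.0 - alpha) * s for p, s in query_sim.items()}


# Toy graph: page "a" links to "b" (topically close) and "c" (topically distant).
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
sim = {("a", "b"): 0.9, ("a", "c"): 0.1, ("b", "a"): 1.0, ("c", "a"): 1.0}
jbc = jbc_scores(links, sim)
ranked = sorted(
    mfcrank_scores(jbc, {"b": 0.4, "c": 0.4}).items(),
    key=lambda item: item[1],
    reverse=True,
)
```

In this toy graph both matched pages have the same query similarity, so the ordering in `ranked` is decided by the similarity-weighted link score alone, which favors the page reached through the topically closer link.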