Modeling anchor text and classifying queries to enhance web document retrieval

Authors: Atsushi Fujii
Affiliations: University of Tsukuba, Tsukuba, Japan
Venue: Proceedings of the 17th international conference on World Wide Web
Year: 2008


Abstract

Several types of queries are widely used on the World Wide Web and the expected retrieval method can vary depending on the query type. We propose a method for classifying queries into informational and navigational types. Because terms in navigational queries often appear in anchor text for links to other pages, we analyze the distribution of query terms in anchor texts on the Web for query classification purposes. While content-based retrieval is effective for informational queries, anchor-based retrieval is effective for navigational queries. Our retrieval system combines the results obtained with the content-based and anchor-based retrieval methods, in which the weight for each retrieval result is determined automatically depending on the result of the query classification. We also propose a method for improving anchor-based retrieval. Our retrieval method, which computes the probability that a document is retrieved in response to the given query, identifies synonyms of query terms in the anchor texts on the Web and uses these synonyms for smoothing purposes in the probability estimation. We use the NTCIR test collections and show the effectiveness of individual methods and the entire Web retrieval system experimentally.
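The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the function names, the `min_count` threshold, the use of a simple "fraction of query terms seen in anchor text" as the navigational weight, and the linear interpolation of scores. It is not the paper's probabilistic model and does not include the synonym-based smoothing.

```python
def navigational_weight(query_terms, anchor_counts, min_count=5):
    """Hypothetical classifier: the fraction of query terms that occur
    in anchor text at least `min_count` times. A value near 1 suggests a
    navigational query, a value near 0 an informational one."""
    if not query_terms:
        return 0.0
    hits = sum(1 for t in query_terms if anchor_counts.get(t, 0) >= min_count)
    return hits / len(query_terms)


def combine(content_scores, anchor_scores, weight):
    """Assumed combination rule: linear interpolation of the two scores.
    `weight` is the share given to anchor-based retrieval. Documents
    missing from one result list get a score of 0.0 for that list."""
    docs = set(content_scores) | set(anchor_scores)
    combined = {
        d: (1 - weight) * content_scores.get(d, 0.0)
        + weight * anchor_scores.get(d, 0.0)
        for d in docs
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for this example only.
anchor_counts = {"tsukuba": 10, "university": 20}
w = navigational_weight(["tsukuba", "university"], anchor_counts)
ranking = combine({"a": 0.9, "b": 0.1}, {"b": 0.8}, w)
```

In this toy run both query terms are frequent in anchor text, so the weight goes entirely to the anchor-based scores and document `b` is ranked above `a`. A query with no terms in anchor text would fall back to the content-based ranking.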