Anchor text extraction for academic search

  • Authors: Shuming Shi; Fei Xing; Mingjie Zhu; Zaiqing Nie; Ji-Rong Wen
  • Affiliations: Microsoft Research Asia; Alibaba Group, China; University of Science and Technology of China; Microsoft Research Asia; Microsoft Research Asia
  • Venue: NLPIR4DL '09 Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries
  • Year: 2009

Abstract

Anchor text plays an especially important role in improving the performance of general Web search, because it is a relatively objective description of a Web page, potentially written by a large number of other Web pages. Academic search provides indexing and search functionality for academic articles. It may be desirable to use anchor text in academic search as well, to improve the quality of search results. The main challenge is that no explicit URLs or anchor text are available for academic articles. In this paper we define and automatically assign a pseudo-URL to each academic article. We then adopt a machine learning approach to extract pseudo-anchor text for academic articles by exploiting the citation relationships between them. The extracted pseudo-anchor text is indexed and used in computing the relevance scores of academic articles. Experiments conducted on 0.9 million research papers show that our approach is able to dramatically improve search performance.
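
The pipeline the abstract outlines (assign a pseudo-URL, collect the text surrounding each citation, attach that text to the cited article) can be illustrated with a minimal Python sketch. This is not the paper's method: the abstract states that a machine learning approach selects the pseudo-anchor text, whereas the sketch below substitutes a fixed word window around the citation marker. The `paper://` scheme, the title-plus-year hashing, the function names, and the input layout (`body`, `references`) are all assumptions made for illustration.

```python
import hashlib
import re
from collections import defaultdict


def pseudo_url(title, year):
    """Hypothetical pseudo-URL: a short hash of the normalized title and year."""
    # Lowercase and collapse punctuation so minor title variants map together.
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    digest = hashlib.sha1(f"{normalized}|{year}".encode("utf-8")).hexdigest()
    return f"paper://{digest[:12]}"


def citation_contexts(text, marker, window=8):
    """Return up to `window` words on each side of every word containing `marker`.

    A fixed window is a stand-in for the learned extractor in the paper.
    """
    words = text.split()
    contexts = []
    for i, word in enumerate(words):
        if marker in word:
            left = words[max(0, i - window):i]
            right = words[i + 1:i + 1 + window]
            contexts.append(" ".join(left + right))
    return contexts


def build_anchor_index(citing_papers):
    """Map each cited article's pseudo-URL to its pseudo-anchor text snippets.

    Each citing paper is assumed to be a dict with a `body` string and a
    `references` dict of citation marker -> (title, year).
    """
    index = defaultdict(list)
    for paper in citing_papers:
        for marker, (title, year) in paper["references"].items():
            snippets = citation_contexts(paper["body"], marker)
            index[pseudo_url(title, year)].extend(snippets)
    return dict(index)
```

A search engine could then index each snippet list as an extra field of the cited article, much as Web anchor text is indexed with its target page. How that field is weighted in the relevance score is not specified in the abstract and is left out here.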