Exploiting locality of Wikipedia links in entity ranking

Authors:
Jovan Pehcevski;Anne-Marie Vercoustre;James A. Thom
Affiliations:
INRIA, Rocquencourt, France;INRIA, Rocquencourt, France;RMIT University, Melbourne, Australia
Venue:
ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Year:
2008

Abstract

Information retrieval from web and XML document collections is ever more focused on returning entities instead of web pages or XML elements. There are many research fields involving named entities; one such field is known as entity ranking, where one goal is to rank entities in response to a query supported with a short list of entity examples. In this paper, we describe our approach to ranking entities from the Wikipedia XML document collection. Our approach utilises the known categories and the link structure of Wikipedia, and more importantly, exploits link co-occurrences to improve the effectiveness of entity ranking. Using the broad context of a full Wikipedia page as a baseline, we evaluate two different algorithms for identifying narrow contexts around the entity examples: one that uses predefined types of elements such as paragraphs, lists and tables; and another that dynamically identifies the contexts by utilising the underlying XML document structure. Our experiments demonstrate that the locality of Wikipedia links can be exploited to significantly improve the effectiveness of entity ranking.
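
The contrast between a broad page-level context and narrow element-level contexts can be illustrated with a small sketch. This is not the scoring method of the paper, which the abstract does not specify; it is a minimal illustration that assumes each context is simply a list of link targets and that a candidate earns one point for every context it shares with at least one entity example. The function name `cooccurrence_scores` and the sample page are hypothetical.

```python
from collections import Counter


def cooccurrence_scores(contexts, examples):
    """Count, for each candidate link target, the number of contexts in
    which it appears together with at least one entity example.

    contexts: iterable of collections of link targets, one per context
    examples: link targets supplied as entity examples
    (Assumed simplification, not the formula used in the paper.)
    """
    examples = set(examples)
    scores = Counter()
    for links in contexts:
        links = set(links)
        if links & examples:
            # Every non-example link in this context co-occurs with an example.
            scores.update(links - examples)
    return scores


# Hypothetical page made of three paragraphs, each a list of link targets.
paragraphs = [
    ["France", "Germany", "Italy"],
    ["Euro", "European Central Bank"],
    ["Spain", "France", "Football"],
]
examples = {"France", "Germany"}

# Broad context: the whole page is treated as a single context.
page_scores = cooccurrence_scores(
    [[target for p in paragraphs for target in p]], examples
)

# Narrow contexts: each paragraph is its own context.
paragraph_scores = cooccurrence_scores(paragraphs, examples)
```

In this toy page, the page-level context credits every link on the page, including "Euro", which never appears near an example. The paragraph-level contexts credit only links that share a paragraph with an example, which is the kind of locality the abstract refers to.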