Entity-Relationship Queries over Wikipedia

  • Authors: Xiaonan Li; Chengkai Li; Cong Yu

  • Affiliations: University of Texas at Arlington; University of Texas at Arlington; Google Research

  • Venue: ACM Transactions on Intelligent Systems and Technology (TIST)

  • Year: 2012

Abstract

Wikipedia is the largest user-generated knowledge base. We propose a structured query mechanism, the entity-relationship query, for searching entities in the Wikipedia corpus by their properties and interrelationships. An entity-relationship query consists of multiple predicates on the desired entities, and the semantics of each predicate is specified with keywords. Entity-relationship queries search for entities directly over text rather than over preextracted structured data stores. This characteristic brings two benefits: (1) query semantics can be expressed intuitively with keywords; (2) only rudimentary entity annotation is required, which is simpler than explicitly extracting and reasoning about complex semantic information before query time. We present a ranking framework for general entity-relationship queries and a position-based Bounded Cumulative Model (BCM) for accurate ranking of query answers. We also explore various weighting schemes to further improve the accuracy of BCM. We test our ideas on a 2008 version of Wikipedia using a collection of 45 queries pooled from the INEX entity ranking track and our own crafted queries. Experiments show that the ranking and weighting schemes are both effective, particularly on multipredicate queries.
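
To make the query structure described in the abstract concrete, the following is a minimal sketch of how an entity-relationship query could be represented in memory: typed entity variables plus a list of keyword-specified predicates, each over one variable (a property) or several (a relationship). The class names `Predicate` and `ERQuery`, the entity types, and the example keywords are all hypothetical illustrations; they are not the paper's query syntax, and the sketch does not implement the ranking framework or BCM.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Predicate:
    """A keyword-specified condition over one or more entity variables (hypothetical)."""
    variables: Tuple[str, ...]   # one variable: property; several: relationship
    keywords: Tuple[str, ...]    # keywords that express the predicate's semantics

    def is_relationship(self) -> bool:
        return len(self.variables) > 1


@dataclass
class ERQuery:
    """An entity-relationship query: typed variables plus predicates (hypothetical)."""
    entity_types: Dict[str, str]   # variable name -> entity type
    predicates: List[Predicate]

    def validate(self) -> None:
        # Every variable used in a predicate must be declared with a type.
        for pred in self.predicates:
            for var in pred.variables:
                if var not in self.entity_types:
                    raise ValueError(f"undeclared entity variable: {var}")

    def is_multipredicate(self) -> bool:
        return len(self.predicates) > 1


# Illustrative query (made-up example): people who graduated from Stanford
# and founded a company located in Silicon Valley.
query = ERQuery(
    entity_types={"x": "PERSON", "y": "COMPANY"},
    predicates=[
        Predicate(variables=("x",), keywords=("Stanford", "graduate")),
        Predicate(variables=("y",), keywords=("Silicon Valley",)),
        Predicate(variables=("x", "y"), keywords=("founded",)),
    ],
)
query.validate()
relationship_count = sum(p.is_relationship() for p in query.predicates)
```

In this sketch the query has two property predicates and one relationship predicate, so it falls into the multipredicate class that the abstract's experiments single out.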