Querying Wikipedia documents and relationships

  • Authors:
  • Huong Nguyen;Thanh Nguyen;Hoa Nguyen;Juliana Freire

  • Affiliations:
  • University of Utah;University of Utah;University of Utah;University of Utah

  • Venue:
  • Proceedings of the 13th International Workshop on the Web and Databases
  • Year:
  • 2010

Abstract

Wikipedia has become an important and rapidly growing source of information. However, the existing infrastructure for querying this information is limited and often ignores both the inherent structure of the information and the links across documents. In this paper, we present a new approach for querying Wikipedia content that supports a simple yet expressive query interface allowing both keyword and structured queries. A unique feature of our approach is that, besides returning documents that match the queries, it also exploits relationships among documents to return richer, multi-document answers. We model Wikipedia as a graph and cast the problem of finding answers to queries as graph search. To guide the answer-search process, we propose a novel weighting scheme to identify important nodes and edges in the graph. By leveraging the structured information available in infoboxes, our approach supports queries that specify constraints over this structure, and we propose a new search algorithm to support these queries. We evaluate our approach using a representative subset of Wikipedia documents and present results showing that our approach is effective and derives high-quality answers.
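
The abstract does not give the weighting scheme or the search algorithm, so the following is only a minimal sketch of the general idea: documents as graph nodes, links as weighted edges, and a multi-document answer as a cheap path connecting documents that match the query. Everything in it is an assumption made for illustration: the toy documents, the infobox fields, the edge weights, the helper names (`build_adjacency`, `match`, `cheapest_path`), and the use of Dijkstra's algorithm as a stand-in for the paper's own search.

```python
import heapq

# Toy document graph. Titles, infobox fields, and weights are illustrative only.
DOCS = {
    "Alfred Hitchcock": {"type": "person", "occupation": "director"},
    "Psycho (1960 film)": {"type": "film", "year": "1960"},
    "Janet Leigh": {"type": "person", "occupation": "actor"},
    "Vertigo (film)": {"type": "film", "year": "1958"},
}

# Undirected links; a lower weight stands for a stronger relationship (assumed).
LINKS = [
    ("Alfred Hitchcock", "Psycho (1960 film)", 1.0),
    ("Psycho (1960 film)", "Janet Leigh", 1.0),
    ("Alfred Hitchcock", "Vertigo (film)", 1.0),
    ("Alfred Hitchcock", "Janet Leigh", 5.0),
]


def build_adjacency(links):
    """Turn a list of (a, b, weight) links into an undirected adjacency map."""
    adj = {}
    for a, b, w in links:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj


def match(docs, keyword=None, **constraints):
    """Titles that contain the keyword and whose infobox meets every constraint."""
    hits = []
    for title, infobox in docs.items():
        if keyword and keyword.lower() not in title.lower():
            continue
        if all(infobox.get(k) == v for k, v in constraints.items()):
            hits.append(title)
    return hits


def cheapest_path(adj, source, target):
    """Dijkstra's algorithm; returns (cost, path), or None if target is unreachable."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in adj.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


adj = build_adjacency(LINKS)
directors = match(DOCS, keyword="hitchcock", occupation="director")
actors = match(DOCS, occupation="actor")
answer = cheapest_path(adj, directors[0], actors[0])
print(answer)
```

In this toy graph the answer connects the director and the actor through the film document rather than through the weaker direct link, which is the kind of multi-document result the abstract describes. A real system would derive node and edge weights from the data instead of hard-coding them.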