Enabling data retrieval: by ranking and beyond

  • Authors:
  • Kevin Chen-Chuan Chang;Chengkai Li

  • Affiliations:
  • University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign

  • Venue:
  • Enabling data retrieval: by ranking and beyond
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The ubiquitous usage of databases for managing structured data, compounded with the expanded reach of the Internet to end users, has brought forward new scenarios of data retrieval. Users often want to express non-traditional fuzzy queries with soft criteria, in contrast to Boolean queries, and to explore what choices are available in databases and how they match the query criteria. Conventional database management systems (DBMS s) have become increasingly inadequate for such new scenarios.Towards enabling data retrieval, this thesis first studies how to fundamentally integrate ranking into databases. We built RankSQL, a DBMS that provides systematic and principled support of ranking queries. With a new ranking algebra and an extended query optimizer for the algebra, RankSQL captures ranking as a first-class construct in databases, together with traditional Boolean constructs. We invented efficient techniques for answering ad-hoc ranking aggregate queries. RankSQL provides significant performance improvement over current DBMSs in processing ranking queries and ranking aggregate queries.This thesis further studies how to enable retrieval mechanisms beyond just ranking. Our explorative study in this direction is exemplified by two novel proposals—One is to integrate clustering and ranking of database query results; the other is to support inverse ranking queries that provide ranks of objects in query context. Injecting such non-traditional facilities into databases presents non-trivial challenges in both defining query semantics and designing query processing methods. We extended SQL language to express such queries and invented partition- and summary-driven approaches to process them.