Early user---system interaction for database selection in massive domain-specific online environments

  • Authors:
  • Jack G. Conrad;Joanne R. S. Claussen

  • Affiliations:
  • Thomson Legal & Regulatory, St. Paul, Minnesota;West Group

  • Venue:
  • ACM Transactions on Information Systems (TOIS)
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

The continued growth of very large data environments such as Westlaw and Dialog, in addition to the World Wide Web, increases the importance of effective and efficient database selection and searching. Current research focuses largely on completely autonomous and automatic selection, searching, and results merging in distributed environments. This fully automatic approach has significant deficiencies, including reliance upon thresholds below which databases with relevant documents are not searched (compromised recall). It also merges documents, often from disparate data sources that users may have discarded before their source selection task proceeded (diluted precision). We examine the impact that early user interaction can have on the process of database selection. After analyzing thousands of real user queries, we show that precision can be significantly increased when queries are categorized by the users themselves, then handled effectively by the system. Such query categorization strategies may eliminate limitations of fully automated query processing approaches. Our system harnesses the WIN search engine, a sibling to INQUERY, run against one or more authority sources when search is required. We compare our approach to one that does not recognize or utilize distinct features associated with user queries. We show that by avoiding a one-size-fits-all approach that restricts the role users can play in information discovery, database selection effectiveness can be appreciably improved.