Client-system collaboration for legal corpus selection in an online production environment

  • Authors:
  • Jack G. Conrad; Joanne R. S. Claussen

  • Affiliations:
  • Thomson Legal & Regulatory, St. Paul, Minnesota; Thomson-West, St. Paul, Minnesota

  • Venue:
  • ICAIL '03: Proceedings of the 9th International Conference on Artificial Intelligence and Law
  • Year:
  • 2003

Abstract

The continued growth of very large data environments such as Westlaw and Dialog, in addition to the World Wide Web, increases the importance of effective and efficient database selection and searching. Current research focuses largely on completely autonomous and automatic selection, searching, and results merging in distributed environments. This fully automatic approach has significant deficiencies. It relies on thresholds below which databases containing relevant documents are not searched (compromised recall). It also merges result sets, often from disparate data sources that users might have excluded before source selection began (diluted precision). We examine the impact that early user interaction can have on the process of database selection. After analyzing thousands of real user queries, we show that precision can be significantly increased when queries are categorized by the users themselves, then interpreted and treated accurately by the system. Such query categorization strategies may eliminate limitations of fully automated query processing approaches. Our system uses the WIN search engine, a sibling of INQUERY, which is run against one or more authority sources when a search is required. We compare our approach to one that does not recognize or utilize distinct features associated with user queries. We show that database selection effectiveness can be appreciably improved by avoiding a one-size-fits-all approach that restricts the role users can play in information discovery.