Client-system collaboration for legal corpus selection in an online production environment

Authors:
Jack G. Conrad; Joanne R. S. Claussen
Affiliations:
Thomson Legal & Regulatory, St. Paul, Minnesota; Thomson-West, St. Paul, Minnesota
Venue:
ICAIL '03 Proceedings of the 9th international conference on Artificial intelligence and law
Year:
2003

Abstract

The continued growth of very large data environments such as Westlaw and Dialog, in addition to the World Wide Web, increases the importance of effective and efficient database selection and searching. Current research focuses largely on completely autonomous and automatic selection, searching, and results merging in distributed environments. This fully automatic approach has significant deficiencies, including reliance upon thresholds below which databases with relevant documents are not searched (compromised recall). It also merges result sets, often from disparate data sources that users may have discarded before their source selection task proceeded (diluted precision). We examine the impact that early user interaction can have on the process of database selection. After analyzing thousands of real user queries, we show that precision can be significantly increased when queries are categorized by the users themselves, then interpreted and treated accurately by the system. Such query categorization strategies may eliminate limitations of fully automated query processing approaches. Our system harnesses the WIN search engine, a sibling to INQUERY, run against one or more authority sources when search is required. We compare our approach to one that does not recognize or utilize distinct features associated with user queries. We show that by avoiding a one-size-fits-all approach that restricts the role users can play in information discovery, database selection effectiveness can be appreciably improved.
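
The contrast the abstract draws between threshold-based automatic selection and user-categorized selection can be illustrated with a toy sketch. This is not the authors' system and does not use the WIN engine. The database names, scores, category label, routing table, and the database-level precision measure are all hypothetical, chosen only to show how a score threshold can skip a relevant database while admitting irrelevant ones, and how a user-supplied category might avoid both problems.

```python
from typing import Dict, List, Set

# Hypothetical scores an automatic selector might assign to each database
# for one query (illustrative numbers, not results from the paper).
SCORES: Dict[str, float] = {
    "state_cases": 0.62,
    "federal_cases": 0.55,
    "news": 0.58,
    "law_reviews": 0.31,
}

# Hypothetical mapping from a user-chosen query category to databases.
ROUTES: Dict[str, List[str]] = {
    "state_law_research": ["state_cases", "law_reviews"],
}

# Databases assumed (for this toy query) to hold relevant documents.
RELEVANT: Set[str] = {"state_cases", "law_reviews"}


def select_by_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Automatic selection: keep databases scoring at or above the threshold."""
    return sorted(db for db, score in scores.items() if score >= threshold)


def select_by_category(category: str, routes: Dict[str, List[str]]) -> List[str]:
    """User-assisted selection: map the user's category label to databases."""
    return sorted(routes.get(category, []))


def precision(selected: List[str], relevant: Set[str]) -> float:
    """Fraction of selected databases that are relevant (0.0 if none selected)."""
    if not selected:
        return 0.0
    return len(set(selected) & relevant) / len(selected)


def recall(selected: List[str], relevant: Set[str]) -> float:
    """Fraction of relevant databases that were selected (0.0 if none relevant)."""
    if not relevant:
        return 0.0
    return len(set(selected) & relevant) / len(relevant)


automatic = select_by_threshold(SCORES, threshold=0.5)
assisted = select_by_category("state_law_research", ROUTES)
```

In this toy setup the threshold skips `law_reviews` (lower recall) and admits `federal_cases` and `news` (lower precision), while the category route selects only the two relevant databases. Real gains would depend on how accurately users categorize their queries and how well the system interprets those categories, which is what the paper evaluates.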