The search problem posed by large heterogeneous data sets in litigation: possible future approaches to research

  • Authors: Jason R. Baron; Paul Thompson

  • Affiliations: National Archives and Records Administration, College Park, MD; Dartmouth College, Hanover, NH

  • Venue: Proceedings of the 11th International Conference on Artificial Intelligence and Law
  • Year: 2007

Abstract

Lawyers and their large institutional clients increasingly face the enormous problem of how to search efficiently and effectively for relevant documents in large heterogeneous electronic data sets in order to respond to litigation demands. Past research indicates that lawyers greatly overestimate their true rate of recall in civil discovery. The unprecedented size, scale, and complexity of electronically stored data now potentially subject to routine capture in litigation, for purposes of preservation, access, and review, present information retrieval researchers with a series of important challenges to overcome. This paper describes the current context of e-discovery and discusses the potential for information retrieval (IR) and artificial intelligence (AI) research to address the challenges of conducting e-discovery. The TREC Legal Track is presented as a forum for the evaluation of e-discovery research, and one new evaluation measure, elusion, is described, which has potential for addressing problems of measuring recall.
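
The abstract names elusion without defining it. As a rough illustration only, the sketch below assumes the definition commonly used in the e-discovery literature: the fraction of documents not retrieved by a search that are in fact relevant. The function names `elusion` and `recall`, and the ten-document toy collection, are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def elusion(retrieved, relevant, collection):
    """Assumed definition: share of NON-retrieved documents that are relevant."""
    unretrieved = set(collection) - set(retrieved)
    if not unretrieved:
        # Everything was retrieved, so nothing could have eluded the search.
        return Fraction(0)
    missed = unretrieved & set(relevant)
    return Fraction(len(missed), len(unretrieved))


def recall(retrieved, relevant):
    """Standard recall: share of relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return Fraction(0)
    return Fraction(len(relevant & set(retrieved)), len(relevant))


# Toy collection of ten document ids (hypothetical data).
collection = range(1, 11)
retrieved = {1, 2, 3, 4}
relevant = {2, 3, 7}

print(elusion(retrieved, relevant, collection))  # 1/6: one of six unretrieved docs is relevant
print(recall(retrieved, relevant))               # 2/3: two of three relevant docs were retrieved
```

Under this assumed definition, elusion can be estimated by sampling only the set of documents the search left behind, which may be why it is attractive when the full set of relevant documents, and therefore recall, is hard to measure directly.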