E-discovery revisited: the need for artificial intelligence beyond information retrieval

  • Authors:
  • Jack G. Conrad

  • Affiliations:
  • Thomson Reuters Research and Development, Saint Paul, MN

  • Venue:
  • Artificial Intelligence and Law
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this work, we provide a broad overview of the distinct stages of E-Discovery. We portray them as an interconnected, often complex workflow process, while relating them to the general Electronic Discovery Reference Model (EDRM). We start with the definition of E-Discovery. We then describe the very positive role that NIST's Text REtrieval Conference (TREC) has added to the science of E-Discovery, in terms of the tasks involved and the evaluation of the legal discovery work performed. Given the critical nature that data analysis plays at various stages of the process, we present a pyramid model, which complements the EDRM model: for gathering and hosting; indexing; searching and navigating; and finally consolidating and summarizing E-Discovery findings. Next we discuss where the current areas of need and areas of growth appear to be, using one of the field's most authoritative surveys of providers and consumers of E-Discovery products and services. We subsequently address some areas of Artificial Intelligence, both Information Retrieval-related and not, which promise to make future contributions to the E-Discovery discipline. Some of these areas include data mining applied to e-mail and social networks, classification and machine learning, and the technologies that will enable next generation E-Discovery. The lesson we convey is that the more IR researchers and others understand the broader context of E-Discovery, including the stages that occur before and after primary search, the greater will be the prospects for broader solutions, creative optimizations and synergies yet to be tapped.