Evaluation of information retrieval for E-discovery

Authors:
Douglas W. Oard;Jason R. Baron;Bruce Hedin;David D. Lewis;Stephen Tomlinson
Affiliations:
College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD;Office of the General Counsel, College Park, MD;H5, San Francisco, CA;David D. Lewis Consulting, Chicago, IL;Open Text Corporation, Ottawa, ON, Canada
Venue:
Artificial Intelligence and Law
Year:
2010

Citing 31
Cited 7

An evaluation of retrieval effectiveness for a full-text document-retrieval system

Communications of the ACM
Information retrieval interaction

Information retrieval interaction
Effects of OCR errors on ranking and feedback using the vector space model

Information Processing and Management: an International Journal
Efficient construction of large test collections

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
How reliable are the results of large-scale information retrieval experiments?

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
European Research Letter: cross-language system evaluation: the CLEF campaigns

Journal of the American Society for Information Science and Technology
A report on the first year of the INitiative for the evaluation of XML retrieval (INEX'02)

Journal of the American Society for Information Science and Technology
Retrieval evaluation with incomplete information

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Forming test collections with no system pooling

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
The TREC terabyte retrieval track

ACM SIGIR Forum
Information retrieval system evaluation: effort, sensitivity, and reliability

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
The Turn: Integration of Information Seeking and Retrieval in Context (The Information Retrieval Series)

The Turn: Integration of Information Seeking and Retrieval in Context (The Information Retrieval Series)
Wittgenstein, Language and Information: "Back to the Rough Ground!" (Information Science and Knowledge Management)

Wittgenstein, Language and Information: "Back to the Rough Ground!" (Information Science and Knowledge Management)
User performance versus precision measures for simple search tasks

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
What makes a query difficult?

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
A statistical method for system evaluation using incomplete judgments

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Bias and the limits of pooling

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Building a test collection for complex document information processing

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Estimating average precision with incomplete and imperfect judgments

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
The search problem posed by large heterogeneous data sets in litigation: possible future approaches to research

Proceedings of the 11th international conference on Artificial intelligence and law
Reliable information retrieval evaluation with incomplete and biased judgements

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A comparison of pooled and sampled relevance judgments

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Introduction to the NTCIR-6 Special Issue

ACM Transactions on Asian Language Information Processing (TALIP)
On information retrieval metrics designed for evaluation with incomplete relevance assessments

Information Retrieval
Evaluation over thousands of queries

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Text collections for FIRE

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness

ACM Transactions on Information Systems (TOIS)
Document categorization in legal electronic discovery: computer classification vs. manual review

Journal of the American Society for Information Science and Technology
Impedance matching of humans ⇔ machines in high-Q information retrieval systems

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Replication and automation of expert judgments: information engineering in legal E-discovery

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics

Emerging AI & law approaches to automating analysis and retrieval of electronically stored information in discovery proceedings

Artificial Intelligence and Law
E-discovery revisited: the need for artificial intelligence beyond information retrieval

Artificial Intelligence and Law
Afterword: data, knowledge, and e-discovery

Artificial Intelligence and Law
Using personality to create alliances in group recommender systems

ICCBR'11 Proceedings of the 19th international conference on Case-Based Reasoning Research and Development
A utility-theoretic ranking method for semi-automated text classification

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Towards minimizing the annotation cost of certified text classification

Proceedings of the 22nd ACM international conference on Information & knowledge management
When to stop reviewing documents in eDiscovery cases: the Lit i View quality monitor and endpoint detector

Proceedings of the Fifth International Conference on Management of Emergent Digital EcoSystems

Abstract

The effectiveness of information retrieval technology in electronic discovery (E-discovery) has become the subject of judicial rulings and practitioner controversy. The scale and nature of E-discovery tasks, however, have pushed traditional information retrieval evaluation approaches to their limits. This paper reviews the legal and operational context of E-discovery and the approaches to evaluating search technology that have evolved in the research community. It then describes a multi-year effort carried out as part of the Text Retrieval Conference to develop evaluation methods for responsive review tasks in E-discovery. This work has led to new approaches to measuring effectiveness in both batch and interactive frameworks, large data sets, and some surprising results for the recall and precision of Boolean and statistical information retrieval methods. The paper concludes by offering some thoughts about future research in both the legal and technical communities toward the goal of reliable, effective use of information retrieval in E-discovery.