Document detection overview

  • Authors: Donna Harman
  • Affiliations: National Institute of Standards and Technology, Gaithersburg, MD
  • Venue: TIPSTER '93: Proceedings of a workshop held at Fredericksburg, Virginia: September 19-23, 1993
  • Year: 1993

Abstract

The goal of the document detection half of the TIPSTER project was to significantly advance the state of the art in effective document detection from large, real-world document collections. Detection had to work in both the routing environment (static queries against a constant stream of new data) and the ad hoc environment (new queries against archival data). An additional requirement was that the algorithms for these tasks be as domain- and language-independent as possible. To demonstrate language independence, the project was carried out in both Japanese and English. To demonstrate domain independence, the test collection was selected to cover many different subject areas and document structures.