Topic detection and tracking in English and Chinese

  • Authors:
  • Charles L. Wayne

  • Affiliations:
  • Department of Defense, Ft. Meade, Maryland

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

Topic Detection and Tracking (TDT) refers to automatic techniques for discovering, threading, and retrieving topically related material in streams of data. Newswire and broadcast news are the canonical sources. In 1999, TDT research was extended from English to Chinese, and carefully annotated multilingual corpora were created. Researchers devised clever approaches to the cross-language challenge, and formal performance evaluations yielded very promising results. This paper outlines the 1999 research tasks, corpora, evaluation procedures, technical approaches, and results. The multilingual, multimedia research and evaluations are continuing in 2000 and 2001 under the DARPA TIDES program.