The TDT corpora, developed to support the DARPA-sponsored program in Topic Detection and Tracking, combine data collected over a nine-month period from 8 English and 3 Chinese sources. The published corpora contain audio, reference text (including written news text and transcripts of the broadcast audio), boundary tables that segment the broadcasts into stories, and relevance tables resulting from millions of human judgments. Sections of the corpora have undergone topic-story, first-story and story-link annotation. Both the TDT-2 and TDT-3 text corpora and the accompanying broadcast audio are now available from the Linguistic Data Consortium. This paper describes the raw material collected for the corpora, the annotation of that material to prepare it for research use, and the formats in which it is distributed. Special attention is paid to the quality control measures developed for these data sets.