This chapter reports on CMU's work in all five TDT-1999 tasks: segmentation (story boundary identification), topic tracking, topic detection, first story detection, and story-link detection. We addressed these tasks as supervised or unsupervised classification problems and applied a variety of statistical learning algorithms to each problem for comparison. For segmentation we used exponential language models and decision trees; for topic tracking we used primarily k-nearest-neighbors classification (along with language models, decision trees, and a variant of the Rocchio approach); for topic detection we used a combination of incremental clustering and agglomerative hierarchical clustering; and for first story detection and story-link detection we used a cosine-similarity-based measure. We also studied the effect of combining the outputs of alternative methods to produce joint classification decisions in topic tracking. We found that combining multiple methods typically improved the classification of new topics compared with using any single method. We evaluated our approaches on multilingual corpora, including stories in English, Mandarin, and Spanish, and on multimedia corpora consisting of newswire texts and the output of automated speech recognition for broadcast news sources. The methods worked reasonably well under all of the above conditions.
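As an illustration of the cosine-similarity-based measure mentioned above for first story detection, the following is a minimal sketch and not the system described in the chapter. It assumes raw term-frequency vectors, whitespace tokenization, and a fixed novelty threshold of 0.3; the function names (`tf_vector`, `cosine`, `first_story_flags`) and the threshold value are hypothetical choices made for this example only.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed weighting, for illustration)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_story_flags(stories, threshold=0.3):
    """Flag a story as a 'first story' when its best match among all
    previously seen stories falls below the (hypothetical) threshold."""
    seen = []
    flags = []
    for text in stories:
        vec = tf_vector(text)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


stories = [
    "earthquake hits city center",
    "earthquake hits city suburbs",
    "election results announced today",
]
print(first_story_flags(stories))
```

Under these assumptions, the second story is not flagged as new because it shares three of its four terms with the first, while the other two have no close predecessor. A story-link detector could reuse the same `cosine` function by thresholding the similarity of a single pair of stories.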