Several information organization, access, and filtering systems can benefit from kinds of document representation other than those used in traditional Information Retrieval (IR). Topic Detection and Tracking (TDT) is one such application. In this paper we demonstrate that named entities are a better choice of unit for document representation than the full set of words. To test this hypothesis, we study the effect of word-based and entity-based representations on Story Link Detection (SLD), a core task in TDT research. Experiments on TDT corpora show that entity-based representations give significant improvements for SLD. We also propose a mechanism to expand the set of named entities used for document representation, which improves performance in some cases. We then go a step further and analyze the limitations of using only named entities for document representation. Our studies and experiments indicate that adding topical terms can help address these limitations.
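The contrast between word-based and entity-based representations can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the abstract does not specify the similarity measure, the entity recognizer, or the decision threshold, so the cosine similarity, the small hand-written entity list standing in for a named entity recognizer, the example stories, and the threshold value below are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def word_terms(text):
    """Word-based representation: counts of lowercased alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def entity_terms(text, entities):
    """Entity-based representation: counts of known entity strings.

    `entities` is a hypothetical gazetteer standing in for a real
    named entity recognizer, which this sketch does not implement.
    """
    lowered = text.lower()
    counts = {e: lowered.count(e.lower()) for e in entities}
    return Counter({e: c for e, c in counts.items() if c > 0})


def cosine(a, b):
    """Cosine similarity of two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def linked(a, b, threshold=0.5):
    """Story link decision: similarity at or above an assumed threshold."""
    return cosine(a, b) >= threshold


# Invented example stories; A and B describe the same event, C does not.
ENTITIES = ["Kofi Annan", "Baghdad", "United Nations"]
STORY_A = "Kofi Annan met officials in Baghdad to discuss weapons inspections."
STORY_B = "In Baghdad, talks led by Kofi Annan focused on the inspection dispute."
STORY_C = "City officials met to discuss building inspections."

word_ab = cosine(word_terms(STORY_A), word_terms(STORY_B))
word_ac = cosine(word_terms(STORY_A), word_terms(STORY_C))
ent_ab = cosine(entity_terms(STORY_A, ENTITIES), entity_terms(STORY_B, ENTITIES))
ent_ac = cosine(entity_terms(STORY_A, ENTITIES), entity_terms(STORY_C, ENTITIES))
```

In this toy case the word-based vectors score the unrelated pair (A, C) higher than the related pair (A, B), because A and C share generic vocabulary, while the entity-based vectors rank the pairs the other way round. Story C contains none of the listed entities, so its entity vector is empty and its similarity to every story is zero. This is the kind of limitation that motivates adding topical terms to an entity-only representation.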