This paper presents algorithms for Chinese and English-Chinese topic detection. Named entities, other nouns, and verbs serve as cue patterns for relating news stories that describe the same event. Lexical translation and name transliteration resolve lexical differences between English and Chinese. A two-threshold scheme determines whether a news story is relevant or irrelevant to a topic cluster. Lookahead information is used to resolve ambiguous cases in clustering. A least-recently-used removal strategy models the time factor so that older, unimportant terms have no effect on clustering. Experimental results show that using nouns and verbs together with the least-recently-used removal strategy outperforms the other models. The named-entity-only approach performs slightly worse, but it avoids the overhead of the nouns-and-verbs approach with the least-recently-used removal strategy.
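The two-threshold decision and the least-recently-used term removal can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `decide` and `LRUTermCluster`, the threshold values, the `capacity` limit, and the weight-overlap similarity are all assumptions chosen for illustration, and the abstract does not specify how lookahead resolves the ambiguous band.

```python
from collections import OrderedDict


def decide(similarity, low, high):
    """Classify a story against a topic cluster with two thresholds (hypothetical rule)."""
    if similarity >= high:
        return "relevant"
    if similarity < low:
        return "irrelevant"
    return "ambiguous"  # between the thresholds: left for lookahead to resolve


class LRUTermCluster:
    """Topic cluster that keeps at most `capacity` terms (assumed limit)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.terms = OrderedDict()  # term -> weight, least recently used first

    def add_story(self, story_terms):
        for term, weight in story_terms.items():
            self.terms[term] = self.terms.get(term, 0.0) + weight
            self.terms.move_to_end(term)  # mark the term as most recently used
        while len(self.terms) > self.capacity:
            self.terms.popitem(last=False)  # evict the least recently used term

    def similarity(self, story_terms):
        """Share of the story's term weight that also occurs in the cluster (assumed measure)."""
        total = sum(story_terms.values())
        if total == 0:
            return 0.0
        shared = sum(w for t, w in story_terms.items() if t in self.terms)
        return shared / total


cluster = LRUTermCluster(capacity=3)
cluster.add_story({"earthquake": 1.0, "taipei": 1.0})
cluster.add_story({"taipei": 1.0, "rescue": 1.0, "aid": 1.0})
verdict = decide(cluster.similarity({"taipei": 1.0, "flood": 1.0}), low=0.2, high=0.6)
```

In this sketch, the term "earthquake" is evicted once the cluster exceeds its capacity, so a later story is compared only against terms that have been used recently. A story whose similarity falls between the two thresholds is neither merged nor rejected immediately.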