We present a working system for automated news analysis that ingests an average of 7600 news articles per day in five languages. For each language, the system detects the major news stories of the day using group-average unsupervised agglomerative clustering. For each cluster, it also tracks related groups of articles published over the previous seven days, using the cosine similarity of weighted term vectors. The system furthermore tracks related news across languages, for all language pairs involved. The cross-lingual cluster similarity is a linear combination of three types of input: (a) cognates, (b) automatically detected references to geographical place names and (c) the results of a mapping process onto a multilingual classification system. A manual evaluation showed that the system produces good results.
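As a rough illustration of the two similarity measures named above, the following Python sketch computes a cosine over weighted term vectors and a linear combination of three cross-lingual scores. The function names, the sparse-dictionary representation and the weights are assumptions made for this example only; the abstract does not state the actual weights or the term-weighting scheme.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse weighted-term vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def cross_lingual_similarity(cognate_sim, place_sim, class_sim,
                             weights=(0.4, 0.3, 0.3)):
    """Linear combination of the three cross-lingual inputs.

    The default weights are hypothetical placeholders, not values
    taken from the described system.
    """
    w_cog, w_geo, w_cls = weights
    return w_cog * cognate_sim + w_geo * place_sim + w_cls * class_sim


# Monolingual tracking: compare today's cluster with an earlier one.
today = {"election": 2.0, "parliament": 1.0}
earlier = {"election": 1.0, "vote": 1.0}
print(cosine(today, earlier))

# Cross-lingual linking: combine three per-signal scores in [0, 1].
print(cross_lingual_similarity(0.5, 0.8, 0.6))
```

In this sketch each of the three inputs is assumed to be already normalised to the range 0 to 1, so the combined score stays in that range whenever the weights sum to one.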