Automatic generation of overview timelines
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Event tracking based on domain dependency
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
A vector space model for automatic indexing
Communications of the ACM
Re-ranking model based on document clusters
Information Processing and Management: an International Journal
An evaluation method of words tendency depending on time-series variation and its improvements
Information Processing and Management: an International Journal
Topic-conditioned novelty detection
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
NLP and IR approaches to monolingual and multilingual link detection
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Hi-index | 0.00 |
In this paper, we propose a novel approach for multilingual story link detection. Our approach utilized the distributional features of terms in timelines and multilingual spaces, together with selected types of named entities in order to get distinctive weights for terms that constitute linguistic representation of events. On timelines term significance is calculated by comparing term distribution of the documents on a day with that of the total document collection. Since two languages can provide more information than one language, term significance is measured on each language space, which is then used as a bridge between two languages on multilingual spaces. Evaluating the method on Korean and Japanese news articles, our method achieved 14.3% improvement for monolingual story pairs, and 16.7% improvement for multilingual story pairs. By measuring the space density, the proposed weighting components are verified with a high density of the intra-event stories and a low density of the inter-events stories. This result indicates that the proposed method is helpful for multilingual story link detection.