SITAC: discovering semantically identical temporally altering concepts in text archives
Proceedings of the 14th International Conference on Extending Database Technology
Time-stamped documents such as newswire articles, blog posts, and other web pages are often archived online. When these archives span long periods, the terminology within them can change significantly. Queries about historical information over such documents therefore need to be translated to account for these temporal changes, so that users receive accurate responses. For example, a query on Sri Lanka should automatically retrieve documents that use its former name, Ceylon. We call such concepts SITACs, i.e., Semantically Identical Temporally Altering Concepts. To discover SITACs, we propose an approach based on a novel framework that integrates natural language processing, association rule mining, and contextual similarity as a learning technique. The proposed approach has been evaluated on real data and has been found to yield good results with respect to efficiency and accuracy.
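The contextual-similarity idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: it assumes that two names for the same concept tend to co-occur with the same words, and it scores candidates by Jaccard overlap of their context sets. The function names (`context_sets`, `jaccard`, `rank_candidates`), the toy documents, and the convention of pre-joining multiword names with an underscore (`sri_lanka`) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def context_sets(docs, targets):
    """Map each target term to the set of other words it co-occurs with."""
    contexts = defaultdict(set)
    for text in docs:
        words = set(text.lower().split())
        for term in targets:
            if term in words:
                contexts[term] |= words - {term}
    return contexts


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(query, candidates, docs):
    """Rank candidate terms by how similar their contexts are to the query's."""
    ctx = context_sets(docs, [query, *candidates])
    scores = {c: jaccard(ctx[query], ctx[c]) for c in candidates}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy archive (assumed data): an older and a newer document about the same place,
# plus an unrelated one.
docs = [
    "ceylon exports tea from colombo",
    "sri_lanka exports tea from colombo",
    "india exports software from bangalore",
]
ranked = rank_candidates("sri_lanka", ["ceylon", "india"], docs)
print(ranked)
```

On this toy archive, `ceylon` ranks above `india` because it shares all of its context words with `sri_lanka`. A realistic system would also need the natural language processing and association rule mining steps that the abstract names, which this sketch leaves out.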