Incorporating terminology evolution for query translation in text retrieval with association rules

  • Authors:
  • Amal C. Kaluarachchi;Aparna S. Varde;Srikanta Bedathur;Gerhard Weikum;Jing Peng;Anna Feldman

  • Affiliations:
  • Montclair State University, Montclair, NJ, USA;Montclair State University, Montclair, NJ, USA;Max Planck Institut für Informatik, Saarbrücken, Germany;Max Planck Institut für Informatik, Saarbrücken, Germany;Montclair State University, Montclair, NJ, USA;Montclair State University, Montclair, NJ, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Abstract

Time-stamped documents such as newswire articles, blog posts, and other web pages are often archived online. When these archives cover long spans of time, the terminology within them can change significantly. Hence, when users pose queries about historical information over such documents, the queries need to be translated to account for these temporal changes so that responses are accurate. For example, a query on Sri Lanka should automatically retrieve documents that use its former name, Ceylon. We call such concepts SITACs, i.e., Semantically Identical Temporally Altering Concepts. To discover SITACs, we propose an approach based on a novel framework that integrates natural language processing, association rule mining, and contextual similarity as a learning technique. The proposed approach has been evaluated on real data and has been found to yield good results with respect to efficiency and accuracy.
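
The abstract does not give implementation details, so the following is only a minimal sketch of how association rules and contextual similarity might be combined to flag a candidate SITAC pair. It is not the authors' framework. It assumes the natural language processing step has already reduced each document to a set of terms, and it uses Jaccard overlap as a stand-in similarity measure. The function names (`mine_rules`, `contextual_similarity`), the thresholds, and the toy documents are all hypothetical.

```python
from collections import Counter

def mine_rules(transactions, min_support=2, min_confidence=0.6):
    """Mine single-term rules a -> b; return {a: set of consequents b}."""
    item_count = Counter()
    pair_count = Counter()
    for terms in transactions:
        terms = set(terms)
        item_count.update(terms)
        for a in terms:
            for b in terms:
                if a != b:
                    pair_count[(a, b)] += 1
    rules = {}
    for (a, b), n in pair_count.items():
        # support = co-occurrence count, confidence = P(b | a)
        if n >= min_support and n / item_count[a] >= min_confidence:
            rules.setdefault(a, set()).add(b)
    return rules

def jaccard(x, y):
    """Jaccard overlap of two sets (assumed similarity measure)."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)

def contextual_similarity(old_docs, new_docs, old_term, new_term, **kw):
    """Compare the rule consequents of a term in each time period."""
    old_ctx = mine_rules(old_docs, **kw).get(old_term, set())
    new_ctx = mine_rules(new_docs, **kw).get(new_term, set())
    return jaccard(old_ctx, new_ctx)

# Toy, hand-made term sets for two time periods (illustrative only).
old_docs = [
    {"ceylon", "colombo", "tea"},
    {"ceylon", "colombo", "tea", "export"},
    {"ceylon", "tea"},
    {"siam", "bangkok", "rice"},
    {"siam", "bangkok", "rice"},
]
new_docs = [
    {"sri lanka", "colombo", "tea"},
    {"sri lanka", "colombo", "tea", "cricket"},
    {"sri lanka", "tea"},
]

print(contextual_similarity(old_docs, new_docs, "ceylon", "sri lanka"))
print(contextual_similarity(old_docs, new_docs, "siam", "sri lanka"))
```

In this toy data, "ceylon" and "sri lanka" imply the same context terms in their respective periods, while "siam" and "sri lanka" share none, so only the first pair would be proposed as a candidate SITAC. A limitation of such a simple overlap measure is that it fails when the context terms themselves have been renamed over time.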