Extracting temporal equivalence relationships among keywords from time-stamped documents

  • Authors:
  • Parvathi Chundi;Mahadevan Subramaniam;R. M. Aruna Weerakoon

  • Affiliations:
  • University of Nebraska at Omaha, Omaha, NE;University of Nebraska at Omaha, Omaha, NE;University of Nebraska at Omaha, Omaha, NE

  • Venue:
  • DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part I
  • Year:
  • 2011

Abstract

Identifying keyword associations from text and search sources facilitates many tasks, such as understanding relationships among concepts, extracting relevant documents, matching advertisements to web pages, and expanding user queries. However, these keyword associations change as the underlying content changes over time. Two keywords that are associated with each other during one time period may not be associated in another, or the context in which they are associated may differ. In this paper, we define an equivalence relationship between a pair of keywords and develop methods to construct a temporal view of that relationship. Given a document set D, a keyword a is associated with a context consisting of the frequently occurring keyword sets (fs) of D in which a appears. Two keywords a and b are equivalent in D if their contexts are the same. We say that a and b are temporally equivalent in a time interval if a and b are equivalent in the documents published during that time interval. Given a time-stamped document set D published over a time period T, we define the temporal equivalence partitioning problem as that of constructing a partitioning of T into a sequence of maximal-length time intervals such that, in each time interval, keywords a and b are either temporally equivalent or the equivalence relationship does not hold. A temporal equivalence partitioning of a document set for a given pair of keywords highlights all of the different contexts in which the keywords are associated, which can be used to generate time-varying keyword suggestions for users. We show the effectiveness of the approach by constructing the temporal equivalence partitionings of several pairs of keywords from the Multi-Domain Sentiment data set and the ICWSM 2009 Spinn3r data set.
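
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, and it rests on several assumptions that the abstract does not state: frequent keyword sets are found by brute-force counting against a hypothetical `min_support` threshold; the two keywords under comparison are removed from each set before contexts are compared (otherwise a set containing only a could never appear in the context of b); a keyword pair with empty contexts is treated as not equivalent; and the partitioning is simplified to evaluating each discrete time unit on its own and merging consecutive units with the same outcome. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(docs, min_support, max_size=3):
    """Keyword sets (up to max_size) occurring in at least min_support documents.
    Brute-force counting; an assumption, suitable for tiny examples only."""
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc))
        for size in range(1, max_size + 1):
            for combo in combinations(words, size):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c >= min_support}


def context(keyword, other, fsets):
    """Frequent sets containing `keyword`, with both compared keywords removed
    (assumption made so that the two contexts are directly comparable)."""
    return {s - {keyword, other} for s in fsets if keyword in s}


def equivalent(a, b, docs, min_support):
    """a and b are equivalent in docs if their (non-empty) contexts are the same."""
    fsets = frequent_sets(docs, min_support)
    ctx_a = context(a, b, fsets)
    ctx_b = context(b, a, fsets)
    return bool(ctx_a) and ctx_a == ctx_b


def partition(timed_docs, a, b, min_support):
    """Simplified partitioning: test each time unit separately, then merge
    consecutive units with the same outcome into (start, end, is_equivalent)."""
    by_unit = {}
    for unit, keywords in timed_docs:
        by_unit.setdefault(unit, []).append(keywords)
    intervals = []
    for unit in sorted(by_unit):
        status = equivalent(a, b, by_unit[unit], min_support)
        if intervals and intervals[-1][2] == status:
            intervals[-1] = (intervals[-1][0], unit, status)
        else:
            intervals.append((unit, unit, status))
    return intervals


# Made-up toy data: (time unit, keywords of one document).
sample = [
    (1, ["camera", "lens", "zoom"]), (1, ["camera", "zoom"]), (1, ["lens", "zoom"]),
    (2, ["camera", "lens", "zoom"]), (2, ["camera", "zoom"]), (2, ["lens", "zoom"]),
    (3, ["camera", "flash"]), (3, ["camera", "flash"]),
    (3, ["lens", "zoom"]), (3, ["lens", "zoom"]),
]
result = partition(sample, "camera", "lens", min_support=2)
print(result)  # [(1, 2, True), (3, 3, False)]
```

In the toy data, "camera" and "lens" share the context {zoom} in time units 1 and 2, so those units merge into one equivalent interval. In unit 3 their contexts diverge ({flash} versus {zoom}), which starts a new interval in which the relationship does not hold.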