Trends Analysis of Topics Based on Temporal Segmentation

  • Authors:
  • Wei Chen; Parvathi Chundi

  • Affiliations:
  • University of Nebraska at Omaha, Omaha, NE 68182, US; University of Nebraska at Omaha, Omaha, NE 68182, US

  • Venue:
  • DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2009

Abstract

Extracting interesting information from large unstructured document sets is a time-consuming task. In this paper, we describe an approach to analyzing the temporal trends of a given topic in a time-stamped document set based on time series segmentation. We consider topics containing multiple keywords and use a fuzzy-set-based method to compute a numeric value that measures the relevance of a document set to the given topic. This relevance measure is then used to assign a discrepancy score to a segmentation of the time period associated with the document set. The discrepancy score of a segmentation represents the likelihood of the topic across all segments in that segmentation. Given a user-specified value k, we then define a min difference k-segmentation to capture the k-segmentation with the maximum possible discrepancy score and describe a dynamic-programming-based algorithm to compute it. The proposed approach is illustrated by several experiments using a subset of the TDT-Pilot Corpus data set. Our experiments show that the min difference k-segmentation successfully highlights the temporal trends of a topic using k segments.
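
The abstract does not spell out the discrepancy score or the recurrence, so the following is only a minimal sketch of the general technique: a dynamic program that splits a sequence of per-time-point relevance values into k contiguous segments so as to maximize the sum of per-segment scores. The names best_k_segmentation and segment_score_neg_sse are hypothetical, and the per-segment score used here (negative squared deviation from the segment mean) is a stand-in assumption, not the paper's discrepancy measure.

```python
from typing import Callable, List, Sequence, Tuple


def segment_score_neg_sse(values: Sequence[float], start: int, end: int) -> float:
    """Placeholder score for values[start:end] (an assumption, not the
    paper's measure): negative sum of squared deviations from the mean."""
    seg = values[start:end]
    mean = sum(seg) / len(seg)
    return -sum((v - mean) ** 2 for v in seg)


def best_k_segmentation(
    values: Sequence[float],
    k: int,
    score: Callable[[Sequence[float], int, int], float] = segment_score_neg_sse,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total score, segments) for the k-segmentation of `values`
    that maximizes the sum of per-segment scores.

    Segments are half-open index ranges (start, end).
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")

    neg_inf = float("-inf")
    # best[j][i]: best total score for covering values[:i] with j segments.
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    # back[j][i]: start index of the last segment in that best solution.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for j in range(1, k + 1):
        for i in range(j, n + 1):
            # The last segment is values[s:i]; the first s points need j-1 segments.
            for s in range(j - 1, i):
                if best[j - 1][s] == neg_inf:
                    continue
                candidate = best[j - 1][s] + score(values, s, i)
                if candidate > best[j][i]:
                    best[j][i] = candidate
                    back[j][i] = s

    # Walk the back-pointers from the end to recover the segment boundaries.
    segments: List[Tuple[int, int]] = []
    i = n
    for j in range(k, 0, -1):
        s = back[j][i]
        segments.append((s, i))
        i = s
    segments.reverse()
    return best[k][n], segments


# Illustrative relevance values for seven consecutive time points (made up).
relevance = [0.25, 0.25, 0.25, 1.0, 1.0, 0.5, 0.5]
total, segments = best_k_segmentation(relevance, k=3)
print(total, segments)
```

The table has (k+1)(n+1) entries and each entry scans up to n split points, so the sketch makes O(k n^2) calls to the score function. Any other per-segment score with the same signature can be passed in as the score argument.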