Trends Analysis of Topics Based on Temporal Segmentation

Authors:
Wei Chen; Parvathi Chundi
Affiliations:
University of Nebraska at Omaha, Omaha, NE 68182, US; University of Nebraska at Omaha, Omaha, NE 68182, US
Venue:
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Year:
2009

Abstract

Extracting interesting information from large unstructured document sets is a time-consuming task. In this paper, we describe an approach to analyzing the temporal trends of a given topic in a time-stamped document set based on time series segmentation. We consider topics containing multiple keywords and use a fuzzy-set-based method to compute a numeric value that measures the relevance of a document set to the given topic. This relevance measure is then used to assign a discrepancy score to a segmentation of the time period associated with the document set. The discrepancy score of a segmentation represents the likelihood of the topic across all segments in the segmentation. Given a user-specified value k, we then define a min difference k-segmentation to capture the k-segmentation with the maximum possible discrepancy score and describe a dynamic-programming-based algorithm to compute it. The proposed approach is illustrated by several experiments using a subset of the TDT-Pilot Corpus data set. Our experiments show that the min difference k-segmentation successfully highlights the temporal trends of a topic using k segments.
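
The abstract does not give the discrepancy score or the fuzzy relevance measure in detail, so the following is only a generic sketch of how a dynamic program can pick the best k-segmentation of a time series once some per-segment score is available. The function names (`best_k_segmentation`, `neg_squared_deviation`) and the scoring rule are hypothetical stand-ins chosen for illustration; they are not the paper's definitions.

```python
from typing import Callable, List, Sequence, Tuple


def neg_squared_deviation(segment: Sequence[float]) -> float:
    """Hypothetical stand-in score: higher when relevance values in a segment agree."""
    mean = sum(segment) / len(segment)
    return -sum((x - mean) ** 2 for x in segment)


def best_k_segmentation(
    values: Sequence[float],
    k: int,
    segment_score: Callable[[Sequence[float]], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split `values` into exactly k contiguous segments maximizing the summed score.

    Returns the best total score and the segments as half-open index pairs.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    neg_inf = float("-inf")
    # best[j][i]: best total score for the first i points split into j segments.
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    # back[j][i]: start index of the last segment in that best split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                if best[j - 1][s] == neg_inf:
                    continue
                candidate = best[j - 1][s] + segment_score(values[s:i])
                if candidate > best[j][i]:
                    best[j][i] = candidate
                    back[j][i] = s

    # Walk the back-pointers from the end to recover segment boundaries.
    segments: List[Tuple[int, int]] = []
    end = n
    for j in range(k, 0, -1):
        start = back[j][end]
        segments.append((start, end))
        end = start
    segments.reverse()
    return best[k][n], segments


# Toy relevance series: one value per time point (made-up numbers).
relevance = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
total, segments = best_k_segmentation(relevance, 3, neg_squared_deviation)
print(total, segments)
```

Under these assumptions the search costs O(k n^2) score evaluations for n time points. Swapping in a different `segment_score` changes which segmentation is preferred without changing the dynamic program itself.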