A Query Language and Its Processing for Time-Series Document Clusters

Authors:
Sophoin Khy; Yoshiharu Ishikawa; Hiroyuki Kitagawa
Affiliations:
Graduate School of Systems and Information Engineering, University of Tsukuba; Information Technology Center, Nagoya University; Graduate School of Systems and Information Engineering, University of Tsukuba, and Center for Computation Sciences, University of Tsukuba
Venue:
ICADL '08: Proceedings of the 11th International Conference on Asian Digital Libraries: Universal and Ubiquitous Access to Information
Year:
2008

Abstract

Document clustering methods for time-series documents produce a sequence of clustering snapshots over time. Analyzing the contents (topics) and trends in a long sequence of such snapshots is laborious because the number of clusters is large; a user may need to inspect every cluster or read every document in each cluster. In this paper, we propose a framework for finding clusters of interest to the user, together with the change patterns, called transition patterns, that involve those clusters. Between adjacent clustering results, a cluster may persist as another cluster, branch into more than one cluster, merge with other clusters to form a single cluster, or disappear. This research aims to provide users with facilities for retrieving specific transition patterns from the clustering results. For this purpose, we propose a query language for time-series document clustering results and an approach to query processing. We present initial experimental results on clustering results for the TDT2 corpus.
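
To make the four transition types concrete, the following is a minimal Python sketch that labels each cluster in one snapshot as persist, branch, merge, or disappear relative to the next snapshot. It is not the method of the paper: the abstract does not say how transitions are detected, so the document-overlap criterion, the `threshold` value, and the names `successors` and `classify_transitions` are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# A clustering snapshot: cluster id -> set of document ids.
Clustering = Dict[str, Set[str]]


def successors(cluster: Set[str], nxt: Clustering, threshold: float) -> List[str]:
    """Ids of next-snapshot clusters holding at least `threshold` of this
    cluster's documents (an assumed criterion, not taken from the paper)."""
    if not cluster:
        return []
    return sorted(
        cid for cid, docs in nxt.items()
        if len(cluster & docs) / len(cluster) >= threshold
    )


def classify_transitions(prev: Clustering, nxt: Clustering,
                         threshold: float = 0.3) -> Dict[str, str]:
    """Label every cluster of `prev` with one of the four transition types."""
    links = {cid: successors(docs, nxt, threshold) for cid, docs in prev.items()}

    # Count how many previous clusters link to each next-snapshot cluster.
    incoming: Dict[str, int] = {}
    for targets in links.values():
        for target in targets:
            incoming[target] = incoming.get(target, 0) + 1

    labels: Dict[str, str] = {}
    for cid, targets in links.items():
        if not targets:
            labels[cid] = "disappear"   # no successor in the next snapshot
        elif len(targets) > 1:
            labels[cid] = "branch"      # splits into several clusters
        elif incoming[targets[0]] > 1:
            labels[cid] = "merge"       # shares its successor with others
        else:
            labels[cid] = "persist"     # one-to-one continuation
    return labels


# Toy snapshots (made-up document ids) covering all four cases.
prev = {
    "a": {"d1", "d2", "d3", "d4"},
    "b": {"d5", "d6"},
    "c": {"d7", "d8"},
    "d": {"d9", "d10"},
    "e": {"d11", "d12"},
}
nxt = {
    "x": {"d1", "d2"},
    "y": {"d3", "d4"},
    "z": {"d5", "d6", "d7", "d8"},
    "w": {"d9", "d10", "d13"},
}
labels = classify_transitions(prev, nxt)
```

On this toy data, cluster `a` is labeled as a branch, `b` and `c` as a merge, `d` as persisting, and `e` as disappearing. A query over transition patterns could then be thought of as a filter over such labels along the snapshot sequence, although the actual query language is defined in the paper itself.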