Dynamic pattern mining: an incremental data clustering approach

  • Authors:
  • Seokkyung Chung; Dennis McLeod

  • Affiliations:
  • Department of Computer Science and Integrated Media System Center, University of Southern California, Los Angeles, California (both authors)

  • Venue:
  • Journal on Data Semantics II
  • Year:
  • 2005

Abstract

We propose a mining framework that identifies useful patterns through incremental data clustering. Given the popularity of Web news services, we focus on mining news streams. News articles are retrieved from Web news services and processed by data mining tools to produce higher-level knowledge, which is stored in a content description database. By exploiting the knowledge in this database, an information delivery agent can answer a user request without interacting with a Web news service directly. A key challenge in news repository management is the high rate of document insertion. To address this problem, we present an incremental hierarchical document clustering algorithm based on neighborhood search. The novelty of the proposed algorithm is its ability to identify meaningful patterns (e.g., news events and news topics) while reducing the amount of computation by maintaining the cluster structure incrementally. In addition, to overcome the lack of topical relations in conceptual ontologies, we propose a topic ontology learning framework that uses the obtained document hierarchy. Experimental results demonstrate that the proposed clustering algorithm produces high-quality clusters and that the topic ontology provides interpretations of news topics at different levels of abstraction.
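
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of incremental clustering with a neighborhood search, not the authors' method. It is flat rather than hierarchical. The class name `IncrementalClusterer`, the bag-of-words representation, the cosine similarity measure, and the `threshold` value are all assumptions made for illustration. Each newly inserted article is compared with previously seen articles; it joins the cluster of its most similar neighbor if that similarity reaches the threshold, and otherwise starts a new cluster, so existing assignments are never recomputed.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalClusterer:
    """Hypothetical single-pass clusterer (illustrative, not from the paper)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed neighborhood similarity cutoff
        self.docs = []              # term-count vector per inserted document
        self.labels = []            # cluster id per inserted document
        self.next_id = 0

    def insert(self, text):
        """Assign one new document to a cluster without reclustering old ones."""
        vec = Counter(text.lower().split())
        best_sim, best_label = 0.0, None
        # Neighborhood search: scan earlier documents for the closest one
        # whose similarity reaches the threshold.
        for other, label in zip(self.docs, self.labels):
            sim = cosine(vec, other)
            if sim >= self.threshold and sim > best_sim:
                best_sim, best_label = sim, label
        if best_label is None:
            # No sufficiently similar neighbor: open a new cluster.
            best_label = self.next_id
            self.next_id += 1
        self.docs.append(vec)
        self.labels.append(best_label)
        return best_label


clusterer = IncrementalClusterer(threshold=0.3)
first = clusterer.insert("earthquake hits city rescue teams deployed")
second = clusterer.insert("rescue teams search city after earthquake")
third = clusterer.insert("stock market rallies on earnings report")
```

In this sketch the two earthquake articles share a cluster and the market article starts its own. A linear scan over all earlier documents is used for simplicity; a practical system would presumably restrict the search with an index.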