Matching Partitions over Time to Reliably Capture Local Clusters in Noisy Domains

  • Authors: Frank Höppner; Mirko Böttcher

  • Affiliations: University of Applied Sciences Braunschweig/Wolfenbüttel, Robert Koch Platz 10-14, D-38440 Wolfsburg, Germany; BT Group, Intelligent Systems Research Centre, Adastral Park, Orion Bldg. pp1/12, Ipswich, IP5 3RE, UK

  • Venue: PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
  • Year: 2007

Abstract

When searching for small clusters, it is difficult to distinguish incidental agglomerations of noisy points from true local patterns. We present the PAMALOC algorithm, which addresses this problem by exploiting the temporal information contained in most business data sets. The algorithm detects local patterns in noisy data sets more reliably than is possible when the temporal information is ignored. This is achieved by making use of the fact that noise does not reproduce its incidental structure over time, whereas even small patterns do. In particular, we developed a method to track clusters over time based on an optimal match of data partitions between time periods.
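
The idea of an optimal match between the partitions of two time periods can be illustrated with a small sketch. This is not the PAMALOC procedure itself, which the abstract does not spell out. It assumes, purely for illustration, that each cluster is summarized by a centroid, that the cost of pairing two clusters is the Euclidean distance between their centroids, and that the best match is the one-to-one assignment with the smallest total cost. The function name `match_partitions` and the sample centroids are hypothetical.

```python
from itertools import permutations
from math import dist


def match_partitions(centroids_a, centroids_b):
    """Pair every cluster of period A with a distinct cluster of period B.

    Assumption (not taken from the paper): clusters are represented by
    centroids and the matching cost is Euclidean centroid distance.
    The search is brute force, so it is only suitable for a handful of clusters.
    Returns (pairs, total_cost), where pairs is a list of (index_a, index_b).
    """
    if len(centroids_a) > len(centroids_b):
        raise ValueError("period A must not have more clusters than period B")

    best_cost = float("inf")
    best_perm = ()
    # Try every injective assignment of A-clusters to B-clusters.
    for perm in permutations(range(len(centroids_b)), len(centroids_a)):
        cost = sum(dist(centroids_a[i], centroids_b[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost


# Hypothetical centroids: two clusters in period 1, three in period 2.
period_1 = [(0.0, 0.0), (5.0, 5.0)]
period_2 = [(5.1, 4.9), (0.2, -0.1), (9.0, 1.0)]

pairs, cost = match_partitions(period_1, period_2)
# pairs == [(0, 1), (1, 0)]: both period-1 clusters reappear close by in period 2,
# while the third period-2 cluster has no counterpart.
```

Under these assumptions, a cluster that keeps finding a nearby counterpart across several periods is a candidate for a true local pattern, whereas an unmatched or erratically matched cluster is more likely an incidental agglomeration of noise. For larger partitions, a polynomial-time assignment solver would replace the brute-force search.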