Constrained data clustering by depth control and progressive constraint relaxation

Authors:
Bi-Ru Dai;Cheng-Ru Lin;Ming-Syan Chen
Affiliations:
Department of Electrical Engineering, National Taiwan University, ROC;Department of Electrical Engineering, National Taiwan University, ROC;Department of Electrical Engineering, National Taiwan University, ROC
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2007

Abstract

Constraint-based mining has recently attracted considerable research attention as a way to incorporate domain knowledge or application-dependent parameters into data mining systems. In this paper, the attributes used to model the constraints are called constraint attributes, and the attributes involved in the objective function to be optimized are called optimization attributes. The constrained clustering considered here optimizes the objective function over the optimization attributes subject to the condition that the imposed constraint is satisfied. Specifically, we address the problem of constrained clustering with numerical constraints, in which the constraint attribute values of any two data items in the same cluster are required to be within the corresponding constraint range. This numerical constrained clustering problem cannot be handled by any conventional clustering algorithm, so we devise several effective and efficient algorithms to solve it. Owing to the intrinsic nature of numerical constrained clustering, the clustering process is order dependent, which in many cases degrades the clustering results. To remedy this drawback, we devise a progressive constraint relaxation technique that improves the overall quality of the clustering results. Specifically, by using a smaller (tighter) constraint range in earlier merge iterations, we leave more room to relax the constraint and seek better solutions in subsequent iterations. Experiments show that the progressive constraint relaxation technique improves not only execution efficiency but also clustering quality.
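
The idea of merging under a numerical constraint with a progressively relaxed range can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the merge criterion, the relaxation schedule, or the stopping rule, so the centroid-distance criterion, the `schedule` fractions, and all function names below are assumptions made for illustration. A single constraint attribute is assumed, so "any two items within the constraint range" reduces to max minus min not exceeding the range.

```python
from itertools import combinations


def centroid(cluster, points):
    """Mean of the optimization attributes over the items in a cluster."""
    n = len(cluster)
    dim = len(points[0])
    return [sum(points[i][d] for i in cluster) / n for d in range(dim)]


def spread(cluster, cons):
    """Largest pairwise difference of the constraint attribute in a cluster."""
    vals = [cons[i] for i in cluster]
    return max(vals) - min(vals)


def merge_pass(clusters, points, cons, limit, k):
    """Greedily merge the closest feasible pair until k clusters remain
    or no pair can be merged without exceeding `limit` (assumed rule)."""
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if spread(clusters[a] + clusters[b], cons) > limit:
                continue  # merging would violate the constraint range
            ca = centroid(clusters[a], points)
            cb = centroid(clusters[b], points)
            dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        if best is None:
            break  # no feasible merge at this constraint range
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


def constrained_clustering(points, cons, constraint_range, k,
                           schedule=(0.5, 0.75, 1.0)):
    """Agglomerative clustering with a hypothetical relaxation schedule:
    early passes use a tighter range, the last pass uses the full range."""
    clusters = [[i] for i in range(len(points))]
    for fraction in schedule:
        clusters = merge_pass(clusters, points, cons,
                              fraction * constraint_range, k)
    return clusters


if __name__ == "__main__":
    # Optimization attribute (1-D) and a separate constraint attribute.
    points = [(0.0,), (0.1,), (5.0,), (5.1,)]
    cons = [1.0, 9.0, 1.5, 9.5]
    print(constrained_clustering(points, cons, constraint_range=1.0, k=2))
```

In this toy input, items 0 and 1 are close in the optimization attribute but far apart in the constraint attribute, so the constraint forces the grouping {0, 2} and {1, 3} instead of the unconstrained {0, 1} and {2, 3}.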