Most current data clustering algorithms in data mining are based on a distance calculation in some metric space. In Spatial Database Systems (SDBS), the Euclidean distance between two data points is often used to represent the relationship between them. However, in some spatial settings and in many other applications, distance alone cannot represent all the attributes of the relation between data points; a more powerful model is needed to record richer relational information between data objects. This paper adopts a graph model in which a database is regarded as a graph: each vertex represents a data point, and each edge, weighted or unweighted, records the relation between the two data points it connects. Based on this graph model, the paper presents a set of cluster analysis criteria to guide data clustering. The criteria can be used to measure clustering results and help improve the quality of clustering. Further, a customizable algorithm using the criteria is proposed and implemented; it can produce clusters according to users' specifications. Preliminary experiments show encouraging results.
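The graph model described above can be illustrated with a minimal Python sketch. The abstract does not state the paper's actual criteria, so the class `RelationGraph` and the measure `internal_weight_ratio` are hypothetical names introduced here, and the measure itself (the share of a cluster's incident edge weight that stays inside the cluster) is an assumed stand-in for a graph-based criterion, not the authors' definition.

```python
from collections import defaultdict


class RelationGraph:
    """Undirected graph: each vertex is a data point, each edge records a relation.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self):
        # adj[u][v] = weight of the edge between u and v
        self.adj = defaultdict(dict)

    def add_edge(self, u, v, weight=1.0):
        # An unweighted edge is modeled as weight 1.0.
        self.adj[u][v] = weight
        self.adj[v][u] = weight


def internal_weight_ratio(graph, cluster):
    """Assumed example criterion: internal weight / (internal + boundary weight).

    Values near 1.0 mean the cluster is well separated from the rest of the graph.
    """
    cluster = set(cluster)
    internal_twice = 0.0  # each internal edge is seen from both endpoints
    boundary = 0.0        # each boundary edge is seen once, from the inside endpoint
    for u in cluster:
        for v, w in graph.adj.get(u, {}).items():
            if v in cluster:
                internal_twice += w
            else:
                boundary += w
    internal = internal_twice / 2.0
    total = internal + boundary
    return internal / total if total > 0 else 0.0


# Two triangles joined by a single bridge edge c-d.
g = RelationGraph()
for u, v in [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]:
    g.add_edge(u, v)

good = internal_weight_ratio(g, {"a", "b", "c"})  # 3 internal edges, 1 boundary edge
poor = internal_weight_ratio(g, {"c", "d"})       # 1 internal edge, 4 boundary edges
```

In this toy graph the triangle `{a, b, c}` scores higher than the bridge pair `{c, d}`, which is the kind of comparison a criterion for measuring clustering results would support. A customizable algorithm in the spirit of the abstract could accept such a measure as a user-supplied parameter, though that design is an assumption here.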