Approximate data mining in very large relational data

Authors:
James C. Bezdek;Richard J. Hathaway;Christopher Leckie;Ramamohanarao Kotagiri
Affiliations:
Department of Computer Science, University of West Florida, Pensacola, FL;Department of Mathematical Sciences, Georgia Southern University, Statesboro, GA;Department of Computer Science and Software Engineering, University of Melbourne, Victoria, Australia;Department of Computer Science and Software Engineering, University of Melbourne, Victoria, Australia
Venue:
ADC '06 Proceedings of the 17th Australasian Database Conference - Volume 49
Year:
2006

Abstract

This paper presents eNERF, an extension of non-Euclidean relational fuzzy c-means (NERFCM) for approximate clustering of very large (unloadable) relational data. The eNERF procedure has four parts: (i) algorithm DF selects distinguished features to be monitored during progressive sampling; (ii) algorithm PS progressively samples a square N×N relation matrix R_N until an n×n sample relation R_n passes a goodness-of-fit test; (iii) algorithm LNERF clusters R_n; and (iv) algorithm xNERF extends the LNERF results to R_N - R_n, using an iterative procedure based on LNERF to compute fuzzy membership values for all objects that remain after the accepted sample has been clustered. Three of the four algorithms are new; only LNERF (called NERFCM in the original literature) predates this article.
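As a rough illustration of steps (iii) and (iv), the sketch below clusters a small sample relation with a relational fuzzy c-means iteration that includes a non-Euclidean "beta-spread" safeguard, then assigns memberships to objects outside the sample. It is not the authors' code. The function names `nerf_cluster` and `extend_memberships`, the parameters, and the toy data are all assumptions made for illustration. The extension step shown here is a simplified one-pass stand-in, not the iterative xNERF procedure the abstract describes, and the DF and PS steps are omitted because the abstract gives no detail on them. The relation is assumed to be symmetric with a zero diagonal.

```python
import numpy as np

def nerf_cluster(D, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Relational fuzzy c-means on an n x n dissimilarity matrix D.

    Hypothetical helper: returns memberships U (c x n), the relational
    prototype weights V (c x n), and the accumulated spread beta.
    """
    n = D.shape[0]
    off_diag = 1.0 - np.eye(n)
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                      # each object's memberships sum to 1
    beta = 0.0
    for _ in range(max_iter):
        Um = U ** m
        V = Um / Um.sum(axis=1, keepdims=True)
        DV = V @ (D + beta * off_diag)      # row i holds (D_beta v_i)_k
        dist = DV - 0.5 * np.sum(DV * V, axis=1, keepdims=True)
        neg = dist < 0
        if neg.any():
            # Negative values signal non-Euclidean data: add the smallest
            # spread to the off-diagonal that makes every distance >= 0.
            gap = np.sum(V * V, axis=1, keepdims=True) - 2.0 * V + 1.0  # ||v_i - e_k||^2
            delta = np.max(-2.0 * dist[neg] / gap[neg])
            dist = dist + 0.5 * delta * gap
            beta += delta
        inv = np.maximum(dist, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, beta

def extend_memberships(D_sample, d_rest, V, beta=0.0, m=2.0):
    """One-pass membership assignment for p objects outside the sample.

    d_rest is p x n: dissimilarities from each remaining object to the
    n sampled objects. Simplified stand-in for the extension step.
    """
    n = D_sample.shape[0]
    Db = D_sample + beta * (1.0 - np.eye(n))
    self_term = 0.5 * np.sum((V @ Db) * V, axis=1)       # v_i^T D_beta v_i / 2
    dist = V @ (d_rest + beta).T - self_term[:, None]    # c x p
    inv = np.maximum(dist, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0)

# Toy data (assumed): squared distances between six points on a line.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
D = (x[:, None] - x[None, :]) ** 2
U, V, beta = nerf_cluster(D, c=2)

# Two objects that were not in the sample.
x_rest = np.array([0.05, 5.15])
d_rest = (x_rest[:, None] - x[None, :]) ** 2
U_rest = extend_memberships(D, d_rest, V, beta)
print(U.argmax(axis=0), U_rest.argmax(axis=0))
```

Only the sample relation and a p×n strip of dissimilarities are held in memory at any time, which is the point of clustering a sample and extending the result when the full N×N relation cannot be loaded.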