Data clustering is an important technique for exploratory data analysis, and has been studied for several years. It has been shown to be useful in many practical domains such as data classification and image processing. Recently, there has been a growing emphasis on exploratory analysis of very large datasets to discover useful patterns and/or correlations among attributes. This is called data mining, and data clustering is regarded as a particular branch of it.

However, existing data clustering methods do not adequately address the problem of processing large datasets with a limited amount of resources (e.g., memory and CPU cycles). As the dataset size increases, they do not scale up well in terms of memory requirement, running time, and result quality.

In this paper, an efficient and scalable data clustering method is proposed, based on a new data structure called the CF-tree, which serves as an in-memory summary of the data distribution. We have implemented it in a system called BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and studied its performance extensively in terms of memory requirements, running time, clustering quality, stability, and scalability; we also compare it with other available methods. Finally, BIRCH is applied to solve two real-life problems: one is building an iterative and interactive pixel classification tool, and the other is generating the initial codebook for image compression.
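The abstract does not spell out what a CF-tree node stores. As a minimal sketch, assuming the commonly described clustering-feature summary (a point count, a per-dimension linear sum, and a sum of squared norms), the following Python shows how such a summary can be merged and queried without keeping the raw points. The class name `ClusteringFeature` and its methods are hypothetical, chosen for illustration, and the tree structure itself is not modeled here.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Hypothetical summary of a group of points (assumed form: N, LS, SS)."""
    n: int                    # number of points summarized
    linear_sum: List[float]   # per-dimension sum of the points
    square_sum: float         # sum of squared Euclidean norms of the points

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        # A single point is summarized by itself.
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Summaries are additive, so merging needs no access to raw points.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the summarized points from the centroid.
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return sqrt(max(variance, 0.0))  # clamp tiny negative rounding error


if __name__ == "__main__":
    a = ClusteringFeature.from_point([0.0, 0.0])
    b = ClusteringFeature.from_point([2.0, 0.0])
    merged = a.merge(b)
    print(merged.n, merged.centroid(), merged.radius())
```

Because each summary has a fixed size regardless of how many points it absorbs, memory use is bounded by the number of summaries kept rather than by the dataset size, which is the kind of property the abstract's scalability claim relies on.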