Clustering data streams is an interesting data mining problem. This article presents three variants of the K-means algorithm for clustering binary data streams: On-line K-means, Scalable K-means, and Incremental K-means. Incremental K-means is the proposed variant; it finds higher-quality solutions in less time. The higher solution quality is obtained through mean-based initialization and incremental learning. The speedup is achieved through a simplified set of sufficient statistics and sparse matrix operations. A summary table of clusters is maintained on-line. The K-means variants are compared with respect to quality of results and speed. The proposed algorithms can be used to monitor transactions.
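To make the idea of on-line clustering with sufficient statistics concrete, here is a minimal sketch of a generic one-pass K-means over sparse binary points. It is not the article's Incremental K-means. The class name `OnlineBinaryKMeans`, the seeding from user-supplied points (rather than the mean-based initialization mentioned above), and the choice to count each seed as one observed point are all assumptions made for illustration. Each point is given as the set of dimensions whose value is 1, and each cluster keeps only a point count and a per-dimension sum.

```python
from typing import List, Set


class OnlineBinaryKMeans:
    """Hypothetical one-pass K-means for sparse binary points.

    Per cluster j, the sufficient statistics are:
      n[j]: number of points assigned so far
      m[j]: per-dimension sum of those points (count of 1s)
    The centroid is m[j] / n[j].
    """

    def __init__(self, dims: int, seeds: List[Set[int]]):
        # Assumption: each seed counts as one already-observed point.
        self.n = [1] * len(seeds)
        self.m = [[0.0] * dims for _ in seeds]
        for j, seed in enumerate(seeds):
            for d in seed:
                self.m[j][d] = 1.0

    def centroid(self, j: int) -> List[float]:
        return [v / self.n[j] for v in self.m[j]]

    def _sq_dist(self, j: int, x: Set[int]) -> float:
        # For binary x: ||x - c||^2 = ||c||^2 + sum over d in x of (1 - 2*c_d),
        # so only the nonzero dimensions of x are visited in the second term.
        # For simplicity, ||c||^2 is recomputed on every call rather than cached.
        c = self.centroid(j)
        return sum(v * v for v in c) + sum(1.0 - 2.0 * c[d] for d in x)

    def update(self, x: Set[int]) -> int:
        """Assign x to the nearest cluster, update its statistics, return its index."""
        j = min(range(len(self.n)), key=lambda i: self._sq_dist(i, x))
        self.n[j] += 1
        for d in x:
            self.m[j][d] += 1.0
        return j
```

Feeding transactions one at a time through `update` keeps the cluster summary current without storing the stream. A practical version would likely cache the squared centroid norm instead of recomputing it for every point.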