The data stream problem has been studied extensively in recent years because stream data has become so easy to collect. The nature of stream data makes it essential to use algorithms that require only one pass over the data, and single-scan stream analysis methods have recently been proposed in this context. However, much stream data is high-dimensional, and high-dimensional data is inherently more complex to cluster, classify, and search by similarity. Recent research discusses methods for projected clustering over high-dimensional data sets. These methods are, however, difficult to generalize to data streams because of their complexity and the large volume of the streams. In this paper, we propose a new high-dimensional projected data stream clustering method called HPStream. The method combines a fading cluster structure with a projection-based clustering methodology. It is incrementally updatable, scales well with both the number of dimensions and the size of the data stream, and achieves better clustering quality than previous stream clustering methods. Our performance study with both real and synthetic data sets demonstrates the efficiency and effectiveness of the proposed framework and implementation methods.
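
As an illustration only, a fading cluster structure that is incrementally updatable in a single pass might be sketched as follows. The abstract does not give the details, so everything specific here is an assumption made for the sketch rather than the paper's definition: the exponential fading function 2^(-lambda * elapsed time), the class name FadingCluster, the decay_rate parameter, and the rule of picking the dimensions with the smallest decayed variance as the projected dimensions.

```python
from typing import List, Sequence


class FadingCluster:
    """Decayed per-dimension summary of one cluster (hypothetical sketch)."""

    def __init__(self, dims: int, decay_rate: float = 0.1) -> None:
        self.decay_rate = decay_rate        # assumed fading parameter (lambda)
        self.weight = 0.0                   # decayed count of points
        self.linear_sum = [0.0] * dims      # decayed sum of values per dimension
        self.square_sum = [0.0] * dims      # decayed sum of squares per dimension
        self.last_update = 0.0              # time of the most recent update

    def _fade(self, now: float) -> None:
        # Assumed fading function: every statistic halves each 1/decay_rate time units.
        factor = 2.0 ** (-self.decay_rate * (now - self.last_update))
        self.weight *= factor
        self.linear_sum = [s * factor for s in self.linear_sum]
        self.square_sum = [s * factor for s in self.square_sum]
        self.last_update = now

    def add(self, point: Sequence[float], now: float) -> None:
        # One-pass update: fade the old statistics, then absorb the new point.
        self._fade(now)
        self.weight += 1.0
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x

    def centroid(self) -> List[float]:
        # Requires that at least one point has been added.
        return [s / self.weight for s in self.linear_sum]

    def variance(self) -> List[float]:
        # Decayed per-dimension variance, clamped at zero against rounding error.
        return [
            max(0.0, q / self.weight - (s / self.weight) ** 2)
            for s, q in zip(self.linear_sum, self.square_sum)
        ]

    def projected_dims(self, k: int) -> List[int]:
        # Assumed rule: keep the k dimensions in which the cluster is tightest.
        var = self.variance()
        tightest = sorted(range(len(var)), key=var.__getitem__)[:k]
        return sorted(tightest)


cluster = FadingCluster(dims=2, decay_rate=1.0)
cluster.add([2.0, 10.0], now=0.0)
cluster.add([4.0, 10.0], now=1.0)
print(cluster.weight)             # 1.5: the first point has faded to half weight
print(cluster.projected_dims(1))  # [1]: the second dimension has zero spread
```

Because only decayed sums are stored, the memory per cluster in this sketch grows with the number of dimensions and not with the length of the stream, and older points gradually lose influence without ever being revisited.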