BIRCH: an efficient data clustering method for very large databases

Authors:
Tian Zhang;Raghu Ramakrishnan;Miron Livny
Affiliations:
Computer Sciences Dept., Univ. of Wisconsin-Madison;Computer Sciences Dept., Univ. of Wisconsin-Madison;Computer Sciences Dept., Univ. of Wisconsin-Madison
Venue:
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Year:
1996

Citing 6
Cited 742

Vector quantization and signal compression

Vector quantization and signal compression
Experiments with Incremental Concept Formation: UNIMEM

Machine Learning
Knowledge Acquisition Via Incremental Conceptual Clustering

Machine Learning
Efficient and Effective Clustering Methods for Spatial Data Mining

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification

SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases
Parallel Algorithms for Hierarchical Clustering

Parallel Algorithms for Hierarchical Clustering

Efficiently supporting ad hoc queries in large datasets of time sequences

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
DEVise: integrated querying and visual exploration of large datasets

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Association rules over interval data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
DEVise (demo abstract): integrated querying and visual exploration of large datasets

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
GeoMiner: a system prototype for spatial data mining

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A spatial data mining method by Delaunay triangulation

GIS '97 Proceedings of the 5th ACM international workshop on Advances in geographic information systems
CURE: an efficient clustering algorithm for large databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
A spatial data mining method by clustering analysis

Proceedings of the 6th ACM international symposium on Advances in geographic information systems
A framework for measuring changes in data characteristics

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
OPTICS: ordering points to identify the clustering structure

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Multi-dimensional selectivity estimation using compressed histogram information

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
WALRUS: a similarity retrieval algorithm for image databases

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Storing semistructured data with STORED

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Scalable algorithms for mining large databases

KDD '99 Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering techniques for large data sets—from the past to the future

KDD '99 Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Entropy-based subspace clustering for mining numerical data

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Compressed data cubes for OLAP aggregate query approximation on continuous dimensions

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Accelerating exact k-means algorithms with geometric reasoning

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Evaluating a class of distance-mapping algorithms for data mining and clustering

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Fast density estimation using CF-kernel for very large databases

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Hierarchical parallel coordinates for exploration of large datasets

VIS '99 Proceedings of the conference on Visualization '99: celebrating ten years
Data mining and the Web: past, present and future

Proceedings of the 2nd international workshop on Web information and data management
ACQ: an automatic clustering and querying approach for large image databases

MULTIMEDIA '99 Proceedings of the seventh ACM international conference on Multimedia (Part 2)
Clustering transactions using large items

Proceedings of the eighth international conference on Information and knowledge management
A multiple-resolution method for edge-centric data clustering

Proceedings of the eighth international conference on Information and knowledge management
Data clustering: a review

ACM Computing Surveys (CSUR)
Finding generalized projected clusters in high dimensional spaces

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Density biased sampling: an improved method for data mining and clustering

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
SQLEM: fast clustering in SQL using the EM algorithm

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient clustering of high-dimensional data sets with application to reference matching

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Identifying prospective customers

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering through decision tree construction

Proceedings of the ninth international conference on Information and knowledge management
Information retrieval on the web

ACM Computing Surveys (CSUR)
Scalability for clustering algorithms revisited

ACM SIGKDD Explorations Newsletter
H-BLOB: a hierarchical visual clustering method using implicit surfaces

Proceedings of the conference on Visualization '00
Outlier detection for high dimensional data

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Data bubbles: quality preserving performance boosting for hierarchical clustering

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Distributed data clustering can be efficient and exact

ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
Tri-plots: scalable tools for multidimensional data mining

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient discovery of error-tolerant frequent itemsets in high dimensions

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining top-n local outliers in large databases

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining user session data to facilitate user interaction with a customer service knowledge base in RightNow Web

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Using navigation data to improve IR functions in the context of web search

Proceedings of the tenth international conference on Information and knowledge management
Finding similar images quickly using object shapes

Proceedings of the tenth international conference on Information and knowledge management
Automatic architectural clustering of software

Advances in software engineering
Theory of keyblock-based image retrieval

ACM Transactions on Information Systems (TOIS)
Requirements for clustering data streams

ACM SIGKDD Explorations Newsletter
Genetic subtyping using cluster analysis

ACM SIGKDD Explorations Newsletter
Accelerating EM for Large Databases

Machine Learning
Approximate XML joins

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Clustering by pattern similarity in large data sets

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
A Monte Carlo algorithm for fast projective clustering

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
An evaluation of sampling methods for data mining with fuzzy C-means

Data mining for design and manufacturing
Why so many clustering algorithms: a position paper

ACM SIGKDD Explorations Newsletter
Data declustering and cluster ordering technique for spatial join scheduling

Information organization and databases
COOLCAT: an entropy-based algorithm for categorical clustering

Proceedings of the eleventh international conference on Information and knowledge management
FREM: fast and robust EM clustering for large data sets

Proceedings of the eleventh international conference on Information and knowledge management
Hyper-rectangle based segmentation and clustering of large video data sets

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Intelligent multimedia computing and networking
Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values

Data Mining and Knowledge Discovery
Likelihood-Based Data Squashing: A Modeling Approach to Instance Construction

Data Mining and Knowledge Discovery
A Multi-Resolution Content-Based Retrieval Approach for Geographic Images

Geoinformatica
Multi-Level Clustering and its Visualization for Exploratory Spatial Analysis

Geoinformatica
Squeezer: an efficient algorithm for clustering categorical data

Journal of Computer Science and Technology
Parallel Mining of Outliers in Large Database

Distributed and Parallel Databases
On Clustering Validation Techniques

Journal of Intelligent Information Systems
Clustering High Dimensional Massive Scientific Datasets

Journal of Intelligent Information Systems
A Decision Criterion for the Optimal Number of Clusters in Hierarchical Clustering

Journal of Global Optimization
Structure-Based Brushes: A Mechanism for Navigating Hierarchically Organized Data and Information Spaces

IEEE Transactions on Visualization and Computer Graphics
HD-Eye: Visual Mining of High-Dimensional Data

IEEE Computer Graphics and Applications
Discovery Visualization Using Fast Clustering

IEEE Computer Graphics and Applications
Mining Very Large Databases

Computer
Data Mining: An Overview from a Database Perspective

IEEE Transactions on Knowledge and Data Engineering
Adaptive Prefetching and Storage Reorganization In A Log-Structured Storage System

IEEE Transactions on Knowledge and Data Engineering
Data Resource Selection in Distributed Visual Information Systems

IEEE Transactions on Knowledge and Data Engineering
An Approach to Active Spatial Data Mining Based on Statistical Information

IEEE Transactions on Knowledge and Data Engineering
DEMON: Mining and Monitoring Evolving Data

IEEE Transactions on Knowledge and Data Engineering
Finding Localized Associations in Market Basket Data

IEEE Transactions on Knowledge and Data Engineering
Redefining Clustering for High-Dimensional Applications

IEEE Transactions on Knowledge and Data Engineering
Clustering for Approximate Similarity Search in High-Dimensional Spaces

IEEE Transactions on Knowledge and Data Engineering
SemQuery: Semantic Clustering and Querying on Heterogeneous Features for Visual Data

IEEE Transactions on Knowledge and Data Engineering
CLARANS: A Method for Clustering Objects for Spatial Data Mining

IEEE Transactions on Knowledge and Data Engineering
On distributing the clustering process

Pattern Recognition Letters
Findout: finding outliers in very large datasets

Knowledge and Information Systems
Fast hierarchical clustering and its validation

Data & Knowledge Engineering
Non-convex clustering using expectation maximization algorithm with rough set initialization

Pattern Recognition Letters - Special issue: Rough sets, pattern recognition and data mining
DynDex: a dynamic and non-metric space indexer

Proceedings of the tenth ACM international conference on Multimedia
Using Projections to Visually Cluster High-Dimensional Data

Computing in Science and Engineering
Approximation algorithms for clustering to minimize the sum of diameters

Nordic Journal of Computing
Constraint-based clustering in large databases

ICDT '01 Proceedings of the 8th International Conference on Database Theory
Mining for Empty Rectangles in Large Data Sets

ICDT '01 Proceedings of the 8th International Conference on Database Theory
Context-Based Similarity Measures for Categorical Databases

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Accurate Recasting of Parameter Estimation Algorithms Using Sufficient Statistics for Efficient Parallel Speed-Up: Demonstrated for Center-Based Data Clustering Algorithms

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Fast Hierarchical Clustering Based on Compressed Data and OPTICS

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory

PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
Iterative Data Squashing for Boosting Based on a Distribution-Sensitive Distance

PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Multiscale Comparison of Temporal Patterns in Time-Series Medical Databases

PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Incremental Clustering for Mining in a Data Warehousing Environment

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Clustering Categorical Data: An Approach Based on Dynamical Systems

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Algorithms for Mining Distance-Based Outliers in Large Datasets

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Efficiently Computing Weighted Proximity Relationships in Spatial Databases

WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Semantic Compression and Pattern Extraction with Fascicles

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
INSITE: A Tool for Interpreting Users' Interaction with a Web Space

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
The 3W Model and Algebra for Unified Data Mining

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Indexing the Distance: An Efficient Method to KNN Processing

Proceedings of the 27th International Conference on Very Large Data Bases
C2P: Clustering based on Closest Pairs

Proceedings of the 27th International Conference on Very Large Data Bases
Approximation Algorithms for Clustering to Minimize the Sum of Diameters

SWAT '00 Proceedings of the 7th Scandinavian Workshop on Algorithm Theory
AUTOCLUST+: Automatic Clustering of Point-Data Sets in the Presence of Obstacles

TSDM '00 Proceedings of the First International Workshop on Temporal, Spatial, and Spatio-Temporal Data Mining-Revised Papers
Value Range Queries on Earth Science Data via Histogram Clustering

TSDM '00 Proceedings of the First International Workshop on Temporal, Spatial, and Spatio-Temporal Data Mining-Revised Papers
Categorizing Visitors Dynamically by Fast and Robust Clustering of Access Logs

WI '01 Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development
CBCM: A Cell-Based Clustering Method for Data Mining Applications

WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
Affinity-Based Probabilistic Reasoning and Document Clustering on the WWW

COMPSAC '00 24th International Computer Software and Applications Conference
Revisiting R-Tree Construction Principles

ADBIS '02 Proceedings of the 6th East European Conference on Advances in Databases and Information Systems
Pattern-Oriented Hierarchical Clustering

ADBIS '99 Proceedings of the Third East European Conference on Advances in Databases and Information Systems
Implementing Data Mining in a DBMS

BNCOD 19 Proceedings of the 19th British National Conference on Databases: Advances in Databases
Analysis of Accuracy of Data Reduction Techniques

DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
Partitioning Algorithms for the Computation of Average Iceberg Queries

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Vmhist: Efficient Multidimensional Histograms with Improved Accuracy

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Interactive Clustering for Transaction Data

DaWaK '01 Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery
Outlier Detection Using Replicator Neural Networks

DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
An Efficient K-Medoids-Based Algorithm Using Previous Medoid Index, Triangular Inequality Elimination Criteria, and Partial Distance Search

DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
CoFD: An Algorithm for Non-distance Based Clustering in High Dimensional Spaces

DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Self-Tuning Clustering: An Adaptive Clustering Method for Transaction Data

DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Fully Dynamic Clustering of Metric Data Sets

BNCOD 19 Proceedings of the 19th British National Conference on Databases: Advances in Databases
Scaling-Up Model-Based Clustering Algorithm by Working on Clustering Features

IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
Data Squashing for Speeding Up Boosting-Based Outlier Detection

ISMIS '02 Proceedings of the 13th International Symposium on Foundations of Intelligent Systems
ODMQL: Object Data Mining Query Language

Proceedings of the International Symposium on Objects and Databases
A Fast Algorithm for Density-Based Clustering in Large Database

PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
Robust Clustering of Large Geo-referenced Data Sets

PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
An Efficient Space-Partitioning Based Algorithm for the K-Means Clustering

PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
Data Mining Techniques for Associations, Clustering and Classification

PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
COE: Clustering with Obstacles Entities. A Preliminary Study

PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Criteria on Proximity Graphs for Boundary Extraction and Spatial Clustering

PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Efficient Hierarchical Clustering Algorithms Using Partially Overlapping Partitions

PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Scalable Hierarchical Clustering Method for Sequences of Categorical Values

PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Enhancing Effectiveness of Outlier Detections for Low Density Patterns

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
On Data Clustering Analysis: Scalability, Constraints, and Validation

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Efficiently Mining Gene Expression Data via Integrated Clustering and Validation Techniques

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Data Mining and Personalization Technologies

DASFAA '99 Proceedings of the Sixth International Conference on Database Systems for Advanced Applications
Summary Grids: Building Accurate Multidimensional Histograms

DASFAA '99 Proceedings of the Sixth International Conference on Database Systems for Advanced Applications
Efficiently Matching Proximity Relationships in Spatial Databases

SSD '99 Proceedings of the 6th International Symposium on Advances in Spatial Databases
A Semantic Model for Hypertext Data Caching

ER '02 Proceedings of the 21st International Conference on Conceptual Modeling
Subspace Clustering Based on Compressibility

DS '02 Proceedings of the 5th International Conference on Discovery Science
Mining Clusters with Association Rules

IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
A Data-Clustering Algorithm on Distributed Memory Multiprocessors

Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
A Generalization-Based Approach to Clustering of Web Usage Sessions

WEBKDD '99 Revised Papers from the International Workshop on Web Usage Analysis and User Profiling
A Framework for Efficient and Anonymous Web Usage Mining Based on Client-Side Tracking

WEBKDD '01 Revised Papers from the Third International Workshop on Mining Web Log Data Across All Customers Touch Points
DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing

DEXA '99 Proceedings of the 10th International Conference on Database and Expert Systems Applications
Declustering Spatial Objects by Clustering for Parallel Disks

DEXA '01 Proceedings of the 12th International Conference on Database and Expert Systems Applications
COFE: A Scalable Method for Feature Extraction from Complex Objects

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
RecTree: An Efficient Collaborative Filtering Method

DaWaK '01 Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery
Data Structures for Minimization of Total Within-Group Distance for Spatio-temporal Clustering

PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
STING: A Statistical Information Grid Approach to Spatial Data Mining

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Finding Dense Clusters in Hyperspace: An Approach Based on Row Shuffling

WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
An Incremental Hierarchical Data Clustering Algorithm Based on Gravity Theory

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Collective, Hierarchical Clustering from Distributed, Heterogeneous Data

Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
Lower dimensional representation of text data in vector space based information retrieval

Computational information retrieval
Clustering large unstructured document sets

Computational information retrieval
Clustering categorical data: an approach based on dynamical systems

The VLDB Journal — The International Journal on Very Large Data Bases
Distance-based outliers: algorithms and applications

The VLDB Journal — The International Journal on Very Large Data Bases
WaveCluster: a wavelet-based clustering approach for spatial data in very large databases

The VLDB Journal — The International Journal on Very Large Data Bases
A survey on wavelet applications in data mining

ACM SIGKDD Explorations Newsletter
Maintaining variance and k-medians over data stream windows

Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
An expectation-maximization algorithm working on data summary

Second international workshop on Intelligent systems design and application
SyMP: an efficient clustering approach to identify clusters of arbitrary shapes in large data sets

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
A robust and efficient clustering algorithm based on cohesion self-merging

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
CLOPE: a fast and effective clustering algorithm for transactional data

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering Data Streams: Theory and Practice

IEEE Transactions on Knowledge and Data Engineering
P-AutoClass: Scalable Parallel Clustering for Mining Large Data Sets

IEEE Transactions on Knowledge and Data Engineering
Rightnow eservice center: internet customer service using a self-learning knowledge base

Eighteenth national conference on Artificial intelligence
Data mining tasks and methods: spatial analysis

Handbook of data mining and knowledge discovery
Connectionist and evolutionary models for learning, discovering and forecasting software effort

Managing data mining technologies in organizations
Mining and monitoring evolving data

Handbook of massive data sets
A unified approach for mining outliers

CASCON '97 Proceedings of the 1997 conference of the Centre for Advanced Studies on Collaborative research
Mining for empty spaces in large data sets

Theoretical Computer Science - Database theory
Effective Management of Hierarchical Storage Using Two Levels of Data Clustering

MSS '03 Proceedings of the 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSS'03)
A bibliography of temporal, spatial and spatio-temporal data mining research

ACM SIGKDD Explorations Newsletter
A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets

ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Efficiently Detecting Arbitrary Shaped Clusters in Image Databases

ICTAI '99 Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence
Interactive Data Analysis on Numeric-Data

IDEAS '99 Proceedings of the 1999 International Symposium on Database Engineering & Applications
Navigating Hierarchies with Structure-Based Brushes

INFOVIS '99 Proceedings of the 1999 IEEE Symposium on Information Visualization
Clustering in very large databases based on distance and density

Journal of Computer Science and Technology
PHC: a fast partition and hierarchy-based clustering algorithm

Journal of Computer Science and Technology
Clustering binary data streams with K-means

DMKD '03 Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Clustering gene expression data in SQL using locally adaptive metrics

DMKD '03 Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Enabling Personalized Recommendation on the Web Based on User Interests and Behaviors

RIDE '01 Proceedings of the 11th International Workshop on research Issues in Data Engineering
Intelligent Web mining

Intelligent exploration of the web
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets

IEEE Transactions on Knowledge and Data Engineering
Conceptual Clustering of Heterogeneous Gene Expression Sequences

Artificial Intelligence Review
Web Usage Mining as a Tool for Personalization: A Survey

User Modeling and User-Adapted Interaction
OP-Cluster: Clustering by Tendency in High Dimensional Space

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Prototype-based mining of numeric data streams

Proceedings of the 2003 ACM symposium on Applied computing
A customizable hybrid approach to data clustering

Proceedings of the 2003 ACM symposium on Applied computing
Classifying large data sets using SVMs with hierarchical clusters

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
New unsupervised clustering algorithm for large datasets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Approximate searches: k-neighbors + precision

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Efficient data mining for calling path patterns in GSM networks

Information Systems
GraphZip: a fast and automatic compression method for spatial data clustering

Proceedings of the 2004 ACM symposium on Applied computing
WALRUS: A Similarity Retrieval Algorithm for Image Databases

IEEE Transactions on Knowledge and Data Engineering
A Human-Computer Interactive Method for Projected Clustering

IEEE Transactions on Knowledge and Data Engineering
ItCompress: An Iterative Semantic Compression Algorithm

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
LDC: Enabling Search By Partial Distance In A Hyper-Dimensional Space

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Leaders-subleaders: an efficient hierarchical clustering algorithm for large data sets

Pattern Recognition Letters
Coordinating computational and visual approaches for interactive feature selection and multivariate clustering

Information Visualization - Special issue on coordinated and multiple views in exploratory visualization
Outlier analysis for gene expression data

Journal of Computer Science and Technology - Special issue on bioinformatics
Adaptive Neural Network Clustering of Web Users

Computer
Hypergraph Models and Algorithms for Data-Pattern-Based Clustering

Data Mining and Knowledge Discovery
Learning in Dynamic Decision Making: The Recognition Process

Computational & Mathematical Organization Theory
Space-efficient cubes for OLAP range-sum queries

Decision Support Systems
Cost-based labeling of groups of mass spectra

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Approximate XML query answers

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Clustering objects on a spatial network

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Incremental and effective data summarization for dynamic hierarchical clustering

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Information-theoretic tools for mining database structure from large data sets

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Automatic categorization of query results

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
MAIDS: mining alarming incidents from data streams

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
A k-Median Algorithm with Running Time Independent of Data Size

Machine Learning
Document clustering via adaptive subspace iteration

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Efficient Disk-Based K-Means Clustering for Relational Databases

IEEE Transactions on Knowledge and Data Engineering
Diagonal Ordering: a new approach to high-dimensional KNN processing

ADC '04 Proceedings of the 15th Australasian database conference - Volume 27
Fully automatic cross-associations

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
On demand classification of data streams

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering moving objects

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining scale-free networks using geodesic clustering

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Programming the K-means clustering algorithm in SQL

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
A top-down approach for density-based clustering using multidimensional indexes

Journal of Systems and Software - Special issue: Performance modeling and analysis of computer systems and networks
An Efficient Mining and Clustering Algorithm for Interactive Walk-Through Traversal Patterns

WI '04 Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence
Simulating the Behaviour of Electronic MarketPlaces with an Agent-Based Approach

WI '04 Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence
A Survey of Outlier Detection Methodologies

Artificial Intelligence Review
ClusterMap: labeling clusters in large datasets via visualization

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Combining Partitional and Hierarchical Algorithms for Robust and Efficient Data Clustering with Cohesion Self-Merging

IEEE Transactions on Knowledge and Data Engineering
Classification and knowledge discovery in protein databases

Journal of Biomedical Informatics - Special issue: Biomedical machine learning
Clustering in Dynamic Spatial Databases

Journal of Intelligent Information Systems
Architecture for knowledge discovery and knowledge management

Knowledge and Information Systems
Projective Clustering by Histograms

IEEE Transactions on Knowledge and Data Engineering
Antipole Tree Indexing to Support Range Search and K-Nearest Neighbor Search in Metric Spaces

IEEE Transactions on Knowledge and Data Engineering
An adjustable algorithm for color quantization

Pattern Recognition Letters
k-means projective clustering

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
An effective and efficient algorithm for high-dimensional outlier detection

The VLDB Journal — The International Journal on Very Large Data Bases
Research issues in automatic database clustering

ACM SIGMOD Record
Array-index: a plug&search K nearest neighbors method for high-dimensional data

Data & Knowledge Engineering
A database clustering methodology and tool

Information Sciences—Informatics and Computer Science: An International Journal
Tree-based clustering for gene expression data

Proceedings of the 2005 ACM symposium on Applied computing
A novel grammar-based genetic programming approach to clustering

Proceedings of the 2005 ACM symposium on Applied computing
The role of visualization in effective data cleaning

Proceedings of the 2005 ACM symposium on Applied computing
iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

ACM Transactions on Database Systems (TODS)
Automatic Subspace Clustering of High Dimensional Data

Data Mining and Knowledge Discovery
GCHL: A grid-clustering algorithm for high-dimensional very large spatial data bases

Pattern Recognition Letters
Combining linear programming and clustering techniques for the classification of research centers

AI Communications
VISTA: validating and refining clusters via visualization

Information Visualization
Using retrieval measures to assess similarity in mining dynamic web clickstreams

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Streaming pattern discovery in multiple time-series

VLDB '05 Proceedings of the 31st international conference on Very large data bases
A Shrinking-Based Clustering Approach for Multidimensional Data

IEEE Transactions on Knowledge and Data Engineering
Knowledge discovery by probabilistic clustering of distributed databases

Data & Knowledge Engineering
Clustering high-dimensional data using an efficient and effective data space reduction

Proceedings of the 14th ACM international conference on Information and knowledge management
Efficiently Mining Gene Expression Data via a Novel Parameterless Clustering Method

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Making SVMs Scalable to Large Data Sets using Hierarchical Cluster Indexing

Data Mining and Knowledge Discovery
Parameter-Free Spatial Data Mining Using MDL

ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
CLUMP: A Scalable and Robust Framework for Structure Discovery

ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Integrating K-Means Clustering with a Relational DBMS Using SQL

IEEE Transactions on Knowledge and Data Engineering
Enhancing Data Analysis with Noise Removal

IEEE Transactions on Knowledge and Data Engineering
A parallel hybrid web document clustering algorithm and its performance study

The Journal of Supercomputing - Special issue: Parallel and distributed processing and applications
A Framework for On-Demand Classification of Evolving Data Streams

IEEE Transactions on Knowledge and Data Engineering
Maxdiff kd-trees for data condensation

Pattern Recognition Letters
Integrating XML data sources using approximate joins

ACM Transactions on Database Systems (TODS)
Hypothesis oriented cluster analysis in data mining by visualization

Proceedings of the working conference on Advanced visual interfaces
Construction of query concepts based on feature clustering of documents

Information Retrieval
QROCK: A quick version of the ROCK algorithm for clustering of categorical data

Pattern Recognition Letters
Adherence clustering: an efficient method for mining market-basket clusters

Information Systems
MPM: a hierarchical clustering algorithm using matrix partitioning method for non-numeric data

Journal of Intelligent Information Systems
Two-phase clustering strategy for gene expression data sets

Proceedings of the 2006 ACM symposium on Applied computing
A framework for resource-aware knowledge discovery in data streams: a holistic approach with its application to clustering

Proceedings of the 2006 ACM symposium on Applied computing
Graph-based synopses for relational selectivity estimation

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
PENS: an algorithm for density-based clustering in peer-to-peer systems

InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
iVIBRATE: Interactive visualization-based framework for clustering large datasets

ACM Transactions on Information Systems (TOIS)
Robust information-theoretic clustering

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering based large margin classification: a scalable approach using SOCP formulation

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Finding centric local outliers in categorical/numerical spaces

Knowledge and Information Systems
LinkClus: efficient clustering via heterogeneous semantic links

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Representation for multiple classified data

DBA'06 Proceedings of the 24th IASTED international conference on Database and applications
An incremental network for on-line unsupervised classification and topology learning

Neural Networks
Adaptive non-linear clustering in data streams

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Processing relaxed skylines in PDMS using distributed data summaries

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Towards interactive indexing for large Chinese calligraphic character databases

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Online Random Shuffling of Large Database Tables

IEEE Transactions on Knowledge and Data Engineering
Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more

Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more
ST-DBSCAN: An algorithm for clustering spatial-temporal data

Data & Knowledge Engineering
Efficient bottom-up hybrid hierarchical clustering techniques for protein sequence classification

Pattern Recognition
Rapid and brief communication: Classification of run-length encoded binary data

Pattern Recognition
A dimensionality reduction algorithm and its application for interactive visualization

Journal of Visual Languages and Computing
Constrained data clustering by depth control and progressive constraint relaxation

The VLDB Journal — The International Journal on Very Large Data Bases
Locally adaptive metrics for clustering high dimensional data

Data Mining and Knowledge Discovery
Can exclusive clustering on streaming data be achieved?

ACM SIGKDD Explorations Newsletter
Novel multi-centroid, multi-run sampling schemes for K-medoids-based algorithms

International Journal of Knowledge-based and Intelligent Engineering Systems
pPOP: Fast yet accurate parallel hierarchical clustering using partitioning

Data & Knowledge Engineering
NOCEA: A rule-based evolutionary algorithm for efficient and effective clustering of massive high-dimensional databases

Applied Soft Computing
Classification of large data sets with mixture models via sufficient EM

Computational Statistics & Data Analysis
A user-oriented contents recommendation system in peer-to-peer architecture

Expert Systems with Applications: An International Journal
Exploratory spatio-temporal data mining and visualization

Journal of Visual Languages and Computing
Supporting ranking and clustering as generalized order-by and group-by

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Trajectory clustering: a partition-and-group framework

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Building statistical models and scoring with UDFs

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
New Algorithms for Efficient High-Dimensional Nonparametric Classification

The Journal of Machine Learning Research
Exploiting parallelism to support scalable hierarchical clustering

Journal of the American Society for Information Science and Technology
Quality-Aware Sampling and Its Applications in Incremental Data Mining

IEEE Transactions on Knowledge and Data Engineering
MESO: Supporting Online Decision Making in Autonomic Computing Systems

IEEE Transactions on Knowledge and Data Engineering
Adaptive real-time anomaly detection with incremental clustering

Information Security Tech. Report
Fast agglomerative hierarchical clustering algorithm using Locality-Sensitive Hashing

Knowledge and Information Systems
Availability of multi-object operations

NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
Focused crawling with scalable ordinal regression solvers

Proceedings of the 24th international conference on Machine learning
Support cluster machine

Proceedings of the 24th international conference on Machine learning
A new data clustering approach: Generalized cellular automata

Information Systems
A k-mean clustering algorithm for mixed numeric and categorical data

Data & Knowledge Engineering
Cell trees: An adaptive synopsis structure for clustering multi-dimensional on-line data streams

Data & Knowledge Engineering
Network anomaly detection with incomplete audit data

Computer Networks: The International Journal of Computer and Telecommunications Networking
Xproj: a framework for projected structural clustering of xml documents

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Constraint-driven clustering

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
A framework for classification and segmentation of massive audio data streams

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Interactive high-dimensional index for large Chinese calligraphic character databases

ACM Transactions on Asian Language Information Processing (TALIP)
GAPS: A clustering method using a new point symmetry-based distance measure

Pattern Recognition
A new intrusion detection system using support vector machines and hierarchical clustering

The VLDB Journal — The International Journal on Very Large Data Bases
Plan selection based on query clustering

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Algorithms for clustering high dimensional and distributed data

Intelligent Data Analysis
RIC: Parameter-free noise-robust clustering

ACM Transactions on Knowledge Discovery from Data (TKDD)
Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data

IEEE Transactions on Knowledge and Data Engineering
A neural-network-based approach to detecting rectangular objects

Neurocomputing
A framework for clustering evolving data streams

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
A shrinking-based approach for multi-dimensional data analysis

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Data bubbles for non-vector data: speeding-up hierarchical clustering in arbitrary metric spaces

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Visualization-informed noise elimination and its application in processing high-spatial-resolution remote sensing imagery

Computers & Geosciences
Compressing large boolean matrices using reordering techniques

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
A framework for projected clustering of high dimensional data streams

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Cufres: clustering using fuzzy representative events selection for the fault recognition problem in telecommunication networks

Proceedings of the ACM first Ph.D. workshop in CIKM
LEGClust—A Clustering Algorithm Based on Layered Entropic Subgraphs

IEEE Transactions on Pattern Analysis and Machine Intelligence
Diva: a variance-based clustering approach for multi-type relational data

Proceedings of the sixteenth ACM conference on information and knowledge management
Nugget discovery in visual exploration environments by query consolidation

Proceedings of the sixteenth ACM conference on information and knowledge management
Grid-based subspace clustering over data streams

Proceedings of the sixteenth ACM conference on information and knowledge management
Statistical methods for automated generation of service engagement staffing plans

IBM Journal of Research and Development - Business optimization
On dominating your neighborhood profitably

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Immune-inspired incremental feature selection technology to data streams

Applied Soft Computing
Automatic kernel clustering with a Multi-Elitist Particle Swarm Optimization Algorithm

Pattern Recognition Letters
Managing discoveries in the visual analytics process

ACM SIGKDD Explorations Newsletter - Special issue on visual analytics
Accelerating k-medoid-based algorithms through metric access methods

Journal of Systems and Software
Referential hierarchical clustering algorithm based upon principal component analysis and genetic algorithm

ACOS'07 Proceedings of the 6th Conference on WSEAS International Conference on Applied Computer Science - Volume 6
Spatio-temporal discretization for sequential pattern mining

Proceedings of the 2nd international conference on Ubiquitous information management and communication
Exploring the relationship between software project duration and risk exposure: A cluster analysis

Information and Management
Unsupervised video shot detection using clustering ensemble with a color global scale-invariant feature transform descriptor

Journal on Image and Video Processing - Color in Image and Video Processing
Fast mining of distance-based outliers in high-dimensional datasets

Data Mining and Knowledge Discovery
Special Section: Point-Based Graphics: Fast vector quantization for efficient rendering of compressed point-clouds

Computers and Graphics
Agent-based simulation of electronic marketplaces with decision support

Proceedings of the 2008 ACM symposium on Applied computing
Clustering techniques utilized in web usage mining

AIKED'06 Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases
A scalable sampling scheme for clustering in network traffic analysis

Proceedings of the 2nd international conference on Scalable information systems
Scaling clustering algorithm for data with categorical attributes

ICCOMP'05 Proceedings of the 9th WSEAS International Conference on Computers
Effective clustering and boundary detection algorithm based on Delaunay triangulation

Pattern Recognition Letters
A general grid-clustering approach

Pattern Recognition Letters
Mining multiple-level fuzzy blocks from multidimensional data

Fuzzy Sets and Systems
Outlier-robust clustering using independent components

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Approximation algorithms for clustering uncertain data

Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Enhanced correlation search technique for clustering cancer gene expression data

SSIP'06 Proceedings of the 6th WSEAS International Conference on Signal, Speech and Image Processing
Website browsing aid: A navigation graph-based recommendation system

Decision Support Systems
Automatic clustering and boundary detection algorithm based on adaptive influence function

Pattern Recognition
Tree-based partition querying: a methodology for computing medoids in large spatial datasets

The VLDB Journal — The International Journal on Very Large Data Bases
SS-ClusterTree: a subspace clustering based indexing algorithm over high-dimensional image features

CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
CURIO: a fast outlier and outlier cluster detection algorithm for large datasets

AIDM '07 Proceedings of the 2nd international workshop on Integrating artificial intelligence and data mining - Volume 84
An Accelerated Algorithm for Density Estimation in Large Databases Using Gaussian Mixtures

Cybernetics and Systems
Selecting valuable training samples for SVMs via data structure analysis

Neurocomputing
A non-supervised approach for repeated sequence detection in TV broadcast streams

Image Communication
SPIRAL: efficient and exact model identification for hidden Markov models

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Context-aware query suggestion by mining click-through and session data

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Fast identification of visual documents using local descriptors

Proceedings of the eighth ACM symposium on Document engineering
Summarizing spatial data streams using ClusterHulls

Journal of Experimental Algorithmics (JEA)
Higher order mining

ACM SIGKDD Explorations Newsletter
Improved search strategies and extensions to k-medoids-based clustering algorithms

International Journal of Business Intelligence and Data Mining
The 3DVDM Approach: A Case Study with Clickstream Data

Visual Data Mining
Locally Scaled Density Based Clustering

ICANNGA '07 Proceedings of the 8th international conference on Adaptive and Natural Computing Algorithms, Part I
Clustering Streaming Time Series Using CBC

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
Adaptive Mining the Approximate Skyline over Data Stream

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
Varying Density Spatial Clustering Based on a Hierarchical Tree

MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
A Novel Spatial Clustering Algorithm with Sampling

MDAI '07 Proceedings of the 4th international conference on Modeling Decisions for Artificial Intelligence
Privacy Preserving BIRCH Algorithm for Clustering over Arbitrarily Partitioned Databases

ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
A Visual and Interactive Data Exploration Method for Large Data Sets and Clustering

ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
E-Stream: Evolution-Based Technique for Stream Clustering

ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
A Dynamic Clustering Algorithm for Mobile Objects

PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
Matching Partitions over Time to Reliably Capture Local Clusters in Noisy Domains

PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
Patch Relational Neural Gas --- Clustering of Huge Dissimilarity Datasets

ANNPR '08 Proceedings of the 3rd IAPR workshop on Artificial Neural Networks in Pattern Recognition
Detecting Current Outliers: Continuous Outlier Detection over Time-Series Data Streams

DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
Hierarchical, Parameter-Free Community Discovery

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Clustering Distributed Sensor Data Streams

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Unsupervised Anomaly Detection in Large Databases Using Bayesian Networks

Applied Artificial Intelligence
Nonlinear clustering-based support vector machine for large data sets

Optimization Methods & Software - Mathematical programming in data mining and machine learning
A framework for estimating complex probability density structures in data streams

Proceedings of the 17th ACM conference on Information and knowledge management
Aggregated cross-media news visualization and personalization

MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
Incremental clustering of dynamic data streams using connectivity based representative points

Data & Knowledge Engineering
Non-negative matrix factorization for semi-supervised data clustering

Knowledge and Information Systems
Finding cohesive clusters for analyzing knowledge communities

Knowledge and Information Systems
CONTOUR: an efficient algorithm for discovering discriminating subsequences

Data Mining and Knowledge Discovery
Scalable 2-Pass Data Mining Technique for Large Scale Spatio-temporal Datasets

KES '07 Proceedings of the 11th International Conference on Knowledge-Based Intelligent Information and Engineering Systems and the XVII Italian Workshop on Neural Networks
NPClu: An approach for clustering spatially extended objects

Intelligent Data Analysis
Facilitating discovery on the private web using dataset digests

Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Multifractal-based cluster hierarchy optimisation algorithm

International Journal of Business Intelligence and Data Mining
A scalable framework for cluster ensembles

Pattern Recognition
A multi-prototype clustering algorithm

Pattern Recognition
TuG synopses for approximate query answering

ACM Transactions on Database Systems (TODS)
Efficiently tracing clusters over high-dimensional on-line data streams

Data & Knowledge Engineering
Neighbor-based pattern detection for windows over streaming data

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Patch clustering for massive data sets

Neurocomputing
A Unified Indexing Structure for Efficient Cross-Media Retrieval

DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
Efficiently Clustering Probabilistic Data Streams

APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Nonlinear Data Analysis Using a New Hybrid Data Clustering Algorithm

PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Clustering by pattern similarity

Journal of Computer Science and Technology
PFHC: A clustering algorithm based on data partitioning for unevenly distributed datasets

Fuzzy Sets and Systems
Towards understanding hierarchical clustering: A data distribution perspective

Neurocomputing
Models for association rules based on clustering and correlation

Intelligent Data Analysis
Effective spatial clustering methods for optimal facility establishment

Intelligent Data Analysis
Top-k typicality queries and efficient query answering methods on large databases

The VLDB Journal — The International Journal on Very Large Data Bases
Preface: an overview on learning from data streams

New Generation Computing
A holistic approach for resource-aware adaptive data stream mining

New Generation Computing
Comparing the performance of traditional cluster analysis, self-organizing maps and fuzzy C-means method for strategic grouping

Expert Systems with Applications: An International Journal
Learning, indexing, and diagnosing network faults

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
DataLens: making a good first impression

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Median Topographic Maps for Biomedical Data Sets

Similarity-Based Clustering
A hybrid novelty score and its use in keystroke dynamics-based user authentication

Pattern Recognition
FARICS: a method of mining spatial association rules and collocations using clustering and Delaunay diagrams

Journal of Intelligent Information Systems
NPUST: An Efficient Clustering Algorithm Using Partition Space Technique for Large Databases

IEA/AIE '09 Proceedings of the 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: Next-Generation Applied Intelligence
GF-DBSCAN: a new efficient and effective data clustering technique for large databases

MUSP'09 Proceedings of the 9th WSEAS international conference on Multimedia systems & signal processing
Mining in Large Noisy Domains

Journal of Data and Information Quality (JDIQ)
SubCOID: an attempt to explore cluster-outlier iterative detection approach to multi-dimensional data analysis in subspace

Proceedings of the 46th Annual Southeast Regional Conference on XX
FuzzyShrinking: improving shrinking-based data mining algorithms using fuzzy concept for multi-dimensional data

Proceedings of the 46th Annual Southeast Regional Conference on XX
A hybrid recommendation procedure for new items using preference boundary

Proceedings of the 11th International Conference on Electronic Commerce
Clustering by exceptions

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
CSBIterKmeans: A New Clustering Algorithm Based on Quantitative Assessment of the Clustering Quality

MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
A Comparative Study of Outlier Detection Algorithms

MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
A new clustering approach using similarity of patterns texture

Intelligent Data Analysis
A comprehensive survey of numeric and symbolic outlier mining techniques

Intelligent Data Analysis
On classification and segmentation of massive audio data streams

Knowledge and Information Systems
A Neighborhood Search Method for Link-Based Tag Clustering

ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
An Outlier Detection Algorithm Based on Arbitrary Shape Clustering

ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
Detecting Projected Outliers in High-Dimensional Data Streams

DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
Clustering for Video Retrieval

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Distributed clustering based on sampling local density estimates

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Fast likelihood search for hidden Markov models

ACM Transactions on Knowledge Discovery from Data (TKDD)
Clustering in the membership embedding space

International Journal of Knowledge Engineering and Soft Data Paradigms
Rough-DBSCAN: A fast hybrid density based clustering method for large data sets

Pattern Recognition Letters
Extending fuzzy and probabilistic clustering to very large data sets

Computational Statistics & Data Analysis
RightNow eService Center: internet customer service using a self-learning knowledge base

IAAI'02 Proceedings of the 14th conference on Innovative applications of artificial intelligence - Volume 1
Indexing 3-D human motion repositories for content-based retrieval

IEEE Transactions on Information Technology in Biomedicine - Special section on computational intelligence in medical systems
A framework for context sensitive services: A knowledge discovery based approach

Decision Support Systems
On-line discovery of flock patterns in spatio-temporal data

Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
K-means clustering versus validation measures: a data-distribution perspective

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Incremental Learning and Memory Consolidation of Whole Body Human Motion Primitives

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Clustering: A neural network approach

Neural Networks
Two-level k-means clustering algorithm for k-τ relationship establishment and linear-time classification

Pattern Recognition
Finding approximate solutions to combinatorial problems with very large data sets using BIRCH

Computational Statistics & Data Analysis
EIDBSCAN: An Extended Improving DBSCAN algorithm with sampling techniques

International Journal of Business Intelligence and Data Mining
RACK: RApid clustering using K-means algorithm

CASE'09 Proceedings of the fifth annual IEEE international conference on Automation science and engineering
Using trees to depict a forest

Proceedings of the VLDB Endowment
A shared execution strategy for multiple pattern mining requests over streaming data

Proceedings of the VLDB Endowment
Trajectory Clustering via Effective Partitioning

FQAS '09 Proceedings of the 8th International Conference on Flexible Query Answering Systems
Intelligent Data Granulation on Load: Improving Infobright's Knowledge Grid

FGIT '09 Proceedings of the 1st International Conference on Future Generation Information Technology
Fast Single-Link Clustering Method Based on Tolerance Rough Set Model

RSFDGrC '09 Proceedings of the 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
Active multi-view object search on a humanoid head

ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Bayesian k-Means as a "Maximization-expectation" algorithm

Neural Computation
A novel manufacturing defect detection method using association rule mining techniques

Expert Systems with Applications: An International Journal
A case-based reasoning system for PCB defect prediction

Expert Systems with Applications: An International Journal
Adherence clustering: an efficient method for mining market-basket clusters

Information Systems
Text clustering algorithm based on spectral graph seriation

CCDC'09 Proceedings of the 21st annual international conference on Chinese control and decision conference
Clustering large data sets based on data compression technique and weighted quality measures

FUZZ-IEEE'09 Proceedings of the 18th international conference on Fuzzy Systems
Learning similarity metrics for event identification in social media

Proceedings of the third ACM international conference on Web search and data mining
Towards optimal indexing for relevance feedback in large image databases

IEEE Transactions on Image Processing
A survey of collaborative filtering techniques

Advances in Artificial Intelligence
Active constrained clustering with multiple cluster representatives

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
A grid-based clustering method for mining frequent trips from large-scale, event-based telematics datasets

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Hybrid clustering algorithm

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
A signal filter based clustering algorithm

WiCOM'09 Proceedings of the 5th International Conference on Wireless communications, networking and mobile computing
Caractérisation de la densité de trafic et de son évolution à partir de trajectoires d'objets mobiles

Proceedings of the 5th French-Speaking Conference on Mobility and Ubiquity Computing
Fast UDFs to compute sufficient statistics on large data sets exploiting caching and sampling

Data & Knowledge Engineering
Overlap pattern synthesis with an efficient nearest neighbor classifier

Pattern Recognition
Optimization on Lie manifolds and pattern recognition

Pattern Recognition
A statistics-based approach to control the quality of subclusters in incremental gravitational clustering

Pattern Recognition
Clustering of time series data-a survey

Pattern Recognition
Environmental chemistry through intelligent atmospheric data analysis

Environmental Modelling & Software
Analyzing knowledge communities using foreground and background clusters

ACM Transactions on Knowledge Discovery from Data (TKDD)
Anomaly intrusion detection by clustering transactional audit streams in a host computer

Information Sciences: an International Journal
Data clustering: 50 years beyond K-means

Pattern Recognition Letters
Mining comprehensible clustering rules with an evolutionary algorithm

GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part II
An applicable hierarchical clustering algorithm for content-based image retrieval

MIRAGE'07 Proceedings of the 3rd international conference on Computer vision/computer graphics collaboration techniques
A comparative analysis of clustering algorithms applied to load profiling

MLDM'03 Proceedings of the 3rd international conference on Machine learning and data mining in pattern recognition
AGRID: an efficient algorithm for clustering large high-dimensional datasets

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Multi-level clustering and reasoning about its clusters using region connection calculus

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
An efficient cell-based clustering method for handling large, high-dimensional data

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Optimized clustering for anomaly intrusion detection

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
A clustering algorithm based on mechanics

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Affection factor optimization in data field clustering

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Distributed, hierarchical clustering and summarization in sensor networks

APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
Outlier detection with streaming dyadic decomposition

ICDM'07 Proceedings of the 7th industrial conference on Advances in data mining: theoretical aspects and applications
Efficiently detecting clusters of mobile objects in the presence of dense noise

Proceedings of the 2010 ACM Symposium on Applied Computing
A document recommendation system based on clustering P2P networks

CDVE'07 Proceedings of the 4th international conference on Cooperative design, visualization, and engineering
Database implementation of a model-free classifier

ADBIS'07 Proceedings of the 11th East European conference on Advances in databases and information systems
Hybrid approaches for clustering

PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
Rough core vector clustering

PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
Continuous adaptive mining the thin skylines over evolving data stream

ICDCIT'07 Proceedings of the 4th international conference on Distributed computing and internet technology
Clustering moving objects in spatial networks

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Continuous medoid queries over moving objects

SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
High-dimensional indexing: transformational approaches to high-dimensional range and similarity searches

High-dimensional indexing: transformational approaches to high-dimensional range and similarity searches
Integrating induction and deduction for noisy data mining

Information Sciences: an International Journal
Enhanced k-means clustering for patient reported outcome

CEA'10 Proceedings of the 4th WSEAS international conference on Computer engineering and applications
On cluster tree for nested and multi-density data clustering

Pattern Recognition
Data compression by volume prototypes for streaming data

Pattern Recognition
Agent-based distributed data mining: the KDEC scheme

Intelligent information agents
Using Hybrid Hierarchical K-means (HHK) clustering algorithm for protein sequence motif Super-Rule-Tree (SRT) structure construction

International Journal of Data Mining and Bioinformatics
Anomaly intrusion detection for evolving data stream based on semi-supervised learning

ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
Enhancing principal direction divisive clustering

Pattern Recognition
Detecting outliers on arbitrary data streams using anytime approaches

Proceedings of the First International Workshop on Novel Data Stream Pattern Mining Techniques
Facilitating discovery on the private web using dataset digests

International Journal of Metadata, Semantics and Ontologies
Clustering based fuzzy logic for multimodal sensor networks: A preprocessing to decision fusion

Journal of Ambient Intelligence and Smart Environments
Clustering by synchronization

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Towards mobility-based clustering

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Enhancing effectiveness of density-based outlier mining scheme with density-similarity-neighbor-based outlier factor

Expert Systems with Applications: An International Journal
Application of a hybrid of genetic algorithm and particle swarm optimization algorithm for order clustering

Decision Support Systems
PHD: an efficient data clustering scheme using partition space technique for knowledge discovery in large databases

Applied Intelligence
Inter-image outliers and their application to image classification

Pattern Recognition
A novel intrusion detection system based on hierarchical clustering and support vector machines

Expert Systems with Applications: An International Journal
Topographic mapping of large dissimilarity data sets

Neural Computation
Column-based cluster and bar axis density in parallel coordinates

Proceedings of the 3rd International Symposium on Visual Information Communication
Pattern discovery in distributed databases

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Approximate pairwise clustering for large data sets via sampling plus extension

Pattern Recognition
Optimizing all-nearest-neighbor queries with trigonometric pruning

SSDBM'10 Proceedings of the 22nd international conference on Scientific and statistical database management
Distance based fast hierarchical clustering method for large datasets

RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing
Collective taxonomizing: A collaborative approach to organizing document repositories

Decision Support Systems
Clustering-based geometric support vector machines

LSMS/ICSEE'10 Proceedings of the 2010 international conference on Life system modeling and simulation and intelligent computing, and 2010 international conference on Intelligent computing for sustainable energy and environment: Part II
A time-efficient pattern reduction algorithm for k-means clustering

Information Sciences: an International Journal
Distributed antipole clustering for efficient data search and management in Euclidean and metric spaces

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Towards improving subspace data analysis

Proceedings of the 48th Annual Southeast Regional Conference
Inter-dimensional fuzzy clustering

Proceedings of the 48th Annual Southeast Regional Conference
Distance-based outlier detection: consolidation and renewed bearing

Proceedings of the VLDB Endowment
Clustering distributed sensor data streams using local processing and reduced communication

Intelligent Data Analysis - Ubiquitous Knowledge Discovery
Multimedia data mining: state of the art and challenges

Multimedia Tools and Applications
Multi-source shared nearest neighbours for multi-modal image clustering

Multimedia Tools and Applications
The discovery of hierarchical cluster structures assisted by a visualization technique

ICONIP'10 Proceedings of the 17th international conference on Neural information processing: theory and algorithms - Volume Part I
MEC -- Monitoring Clusters' Transitions

Proceedings of the 2010 conference on STAIRS 2010: Proceedings of the Fifth Starting AI Researchers' Symposium
EPIC: efficient integration of partitional clustering algorithms for classification

SEAL'10 Proceedings of the 8th international conference on Simulated evolution and learning
MSDBSCAN: multi-density scale-independent clustering algorithm based on DBSCAN

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Spatial neighborhood clustering based on data field

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
A top-down approach for hierarchical cluster exploration by visualization

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
An improved KNN based outlier detection algorithm for large datasets

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Simulation of DNA damage clustering after proton irradiation using an adapted DBSCAN algorithm

Computer Methods and Programs in Biomedicine
Boosting the scalability of botnet detection using adaptive traffic sampling

Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security
Fast outlier detection for very large log data

Expert Systems with Applications: An International Journal
Aseismic ability estimation of school building using predictive data mining models

Expert Systems with Applications: An International Journal
Active learning and subspace clustering for anomaly detection

Intelligent Data Analysis
Learning latent variable models from distributed and abstracted data

Information Sciences: an International Journal
Review: data mining: Past, present and future

The Knowledge Engineering Review
XML data clustering: An overview

ACM Computing Surveys (CSUR)
Parallel WaveCluster: A linear scaling parallel clustering algorithm implementation with application to very large datasets

Journal of Parallel and Distributed Computing
Generalized scatter plots

Information Visualization
Adaptive clustering and interactive visualizations to support the selection of video clips

Proceedings of the 1st ACM International Conference on Multimedia Retrieval
Aggregate distance based clustering using Fibonacci series-FIBCLUS

APWeb'11 Proceedings of the 13th Asia-Pacific web conference on Web technologies and applications
Precise anytime clustering of noisy sensor data with logarithmic complexity

Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data
Enhancing grid-density based clustering for high dimensional data

Journal of Systems and Software
A novel ant-based clustering algorithm using the kernel method

Information Sciences: an International Journal
Tolerance rough set theory based data summarization for clustering large datasets

Transactions on rough sets XIV
Exploratory monitoring of large-scale networks using clustering algorithms

Proceedings of the First International Workshop on Data Mining for Service and Maintenance
Approximate kernel k-means: solution to large scale kernel clustering

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
INCONCO: interpretable clustering of numerical and categorical objects

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Behavioural Proximity Discovery: an adaptive approach for root cause analysis

International Journal of Business Intelligence and Data Mining
Summarizing cluster evolution in dynamic environments

ICCSA'11 Proceedings of the 2011 international conference on Computational science and its applications - Volume Part II
Spatial clustering to uncluttering map visualization in SOLAP

ICCSA'11 Proceedings of the 2011 international conference on Computational science and its applications - Volume Part I
Semantically-guided clustering of text documents via frequent subgraphs discovery

ISMIS'11 Proceedings of the 19th international conference on Foundations of intelligent systems
SpectralCAT: Categorical spectral clustering of numerical and nominal data

Pattern Recognition
Partitioning hard clustering algorithms based on multiple dissimilarity matrices

Pattern Recognition
Density based subspace clustering over dynamic data

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Fast and accurate trajectory streams clustering

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Comparing clustering and metaclustering algorithms

MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition
Online and offline trend cluster discovery in spatially distributed data streams

MSM'10/MUSE'10 Proceedings of the 2010 international conference on Analysis of social media and ubiquitous data
Mining Concept Sequences from Large-Scale Search Logs for Context-Aware Query Suggestion

ACM Transactions on Intelligent Systems and Technology (TIST)
A survey: hybrid evolutionary algorithms for cluster analysis

Artificial Intelligence Review
Multi density DBSCAN

IDEAL'11 Proceedings of the 12th international conference on Intelligent data engineering and automated learning
OLAP over continuous domains via density-based hierarchical clustering

KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part II
Non-separable transforms for clustering trajectories

KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part II
A clustering algorithm for multiple data streams based on spectral component similarity

Information Sciences: an International Journal
CLAP: Collaborative pattern mining for distributed information systems

Decision Support Systems
Improvements in image categorization using codebook ensembles

Image and Vision Computing
Summarization and matching of density-based clusters in streaming environments

Proceedings of the VLDB Endowment
Anomaly intrusion detection based on clustering a data stream

ISC'06 Proceedings of the 9th international conference on Information Security
Hierarchical indexing structure for 3D human motions

MMM'07 Proceedings of the 13th international conference on Multimedia Modeling - Volume Part I
Contextual maps for browsing huge document collections

ISMIS'06 Proceedings of the 16th international conference on Foundations of Intelligent Systems
A Voronoi diagram approach to autonomous clustering

DS'06 Proceedings of the 9th international conference on Discovery Science
Clustering based on compressed data for categorical and mixed attributes

SSPR'06/SPR'06 Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition
An immune network for contextual text data clustering

ICARIS'06 Proceedings of the 5th international conference on Artificial Immune Systems
LOCAR: local compression of alternative routes

Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
A maximum profit coverage algorithm with application to small molecules cluster identification

WEA'06 Proceedings of the 5th international conference on Experimental Algorithms
Mining outliers in spatial networks

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Ranking outliers using symmetric neighborhood relationship

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
On robust and effective k-anonymity in large databases

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Clustering mixed type attributes in large dataset

ISPA'05 Proceedings of the Third international conference on Parallel and Distributed Processing and Applications
Cluster aggregate inequality and multi-level hierarchical clustering

PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Efficient processing of ranked queries with sweeping selection

PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Grid-ODF: detecting outliers effectively and efficiently in large multi-dimensional databases

CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
An auto-stopped hierarchical clustering algorithm integrating outlier detection algorithm

WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
A clustering algorithm based absorbing nearest neighbors

WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
An incremental document clustering for the large document database

AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
A coarse grained parallel algorithm for closest larger ancestors in trees with applications to single link clustering

HPCC'05 Proceedings of the First international conference on High Performance Computing and Communications
Motion-Alert: automatic anomaly detection in massive moving objects

ISI'06 Proceedings of the 4th IEEE international conference on Intelligence and Security Informatics
Shared execution strategy for neighbor-based pattern mining requests over streaming windows

ACM Transactions on Database Systems (TODS)
Generalized projected clustering in high-dimensional data streams

APWeb'06 Proceedings of the 8th Asia-Pacific Web conference on Frontiers of WWW Research and Development
SCUBA: scalable cluster-based algorithm for evaluating continuous spatio-temporal queries on moving objects

EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Echidna: efficient clustering of hierarchical data for network traffic analysis

NETWORKING'06 Proceedings of the 5th international IFIP-TC6 conference on Networking Technologies, Services, and Protocols; Performance of Computer and Communication Networks; Mobile and Wireless Communications Systems
A peer-to-peer CF-Recommendation for ubiquitous environment

PRIMA'06 Proceedings of the 9th Pacific Rim international conference on Agent Computing and Multi-Agent Systems
HOV3: an approach to visual cluster analysis

ADMA'06 Proceedings of the Second international conference on Advanced Data Mining and Applications
A fast implementation of the EM algorithm for mixture of multinomials

ADMA'06 Proceedings of the Second international conference on Advanced Data Mining and Applications
Applying the Mahalanobis-Taguchi strategy for software defect diagnosis

Automated Software Engineering
Incremental clustering for trajectories

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
Attribute outlier detection over data streams

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
SIC-means: a semi-fuzzy approach for clustering data streams using c-means

ANNPR'10 Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition
Clustering very large dissimilarity data sets

ANNPR'10 Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition
Weighted k-means for density-biased clustering

DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
A clustering model based on matrix approximation with applications to cluster system log files

ECML'05 Proceedings of the 16th European conference on Machine Learning
An improvement algorithm for accessing patterns through clustering in interactive VRML environments

PCM'04 Proceedings of the 5th Pacific Rim conference on Advances in Multimedia Information Processing - Volume Part III
Succinct and informative cluster descriptions for document repositories

WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management
Scalable clustering using graphics processors

WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management
Dynamic incremental data summarization for hierarchical clustering

WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management
A distributed algorithm for outlier detection in a large database

DNIS'05 Proceedings of the 4th international conference on Databases in Networked Information Systems
HYBRID: from atom-clusters to molecule-clusters

FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part I
Dynamic pattern mining: an incremental data clustering approach

Journal on Data Semantics II
Mining quantitative association rules on overlapped intervals

ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
A grid-based clustering algorithm for high-dimensional data streams

ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
On autonomous k-means clustering

ISMIS'05 Proceedings of the 15th international conference on Foundations of Intelligent Systems
PatZip: pattern-preserved spatial data compression

PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Determining the number of clusters using information entropy for mixed data

Pattern Recognition
Efficient trade-off between speed processing and accuracy in summarizing data streams

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
An overview of web data clustering practices

EDBT'04 Proceedings of the 2004 international conference on Current Trends in Database Technology
Clustering large datasets using Cobweb and k-means in tandem

AI'04 Proceedings of the 17th Australian joint conference on Advances in Artificial Intelligence
Towards an ontology-based spatial clustering framework

AI'05 Proceedings of the 18th Canadian Society conference on Advances in Artificial Intelligence
Towards the adaptive organization: formation and conservative reconfiguration of agents coalitions

AIS-ADM 2005 Proceedings of the 2005 international conference on Autonomous Intelligent Systems: agents and Data Mining
Medoid queries in large spatial databases

SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
On discovering moving clusters in spatio-temporal data

SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
ADWICE – anomaly detection with real-time incremental clustering

ICISC'04 Proceedings of the 7th international conference on Information Security and Cryptology
Privacy preserving BIRCH algorithm for clustering over vertically partitioned databases

SDM'06 Proceedings of the Third VLDB international conference on Secure Data Management
Adaptive web usage profiling

WebKDD'05 Proceedings of the 7th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
On clustering techniques for change diagnosis in data streams

WebKDD'05 Proceedings of the 7th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
On approximation algorithms for data mining applications

Efficient Approximation and Online Algorithms
KIDBSCAN: a new efficient data clustering algorithm

ICAISC'06 Proceedings of the 8th international conference on Artificial Intelligence and Soft Computing
Clustering categorical data using qualified nearest neighbors selection model

AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
Integrative parameter-free clustering of data with mixed type attributes

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Spatio-temporal similarity analysis between trajectories on road networks

ER'05 Proceedings of the 24th international conference on Perspectives in Conceptual Modeling
Interactive retrieval of video sequences from local feature dynamics

AMR'05 Proceedings of the Third international conference on Adaptive Multimedia Retrieval: user, context, and feedback
Scalable k-means++

Proceedings of the VLDB Endowment
Mining uncertain data streams using clustering feature decision trees

ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part II
StreamKM++: A clustering algorithm for data streams

Journal of Experimental Algorithmics (JEA)
A clustering approach using weighted similarity majority margins

ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
Feature selection and clustering in software quality prediction

EASE'07 Proceedings of the 11th international conference on Evaluation and Assessment in Software Engineering
Statistical modeling of dissimilarity increments for d-dimensional data: Application in partitional clustering

Pattern Recognition
SpaGRID: a spatial grid framework for high dimensional medical databases

HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
AnyOut: anytime outlier detection on streaming data

DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Survey on particle swarm optimization based clustering analysis

SIDE'12 Proceedings of the 2012 international conference on Swarm and Evolutionary Computation
Mining temporal patterns in popularity of web items

Information Sciences: an International Journal
Exploiting constraint inconsistence for dimension selection in subspace clustering: A semi-supervised approach

Neurocomputing
Objective function-based clustering

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
A density-based spatial clustering algorithm considering both spatial proximity and attribute similarity

Computers & Geosciences
Cluster analysis for strategic management: a case study of IKEA

ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part II
Aggregating and disaggregating flexibility objects

SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
SOStream: self organizing density-based clustering over data stream

MLDM'12 Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition
CloudVista: interactive and economical visual cluster analysis for big data in the cloud

Proceedings of the VLDB Endowment
Cluster_KDD: a visual clustering and knowledge discovery platform based on concept lattice

ICSI'12 Proceedings of the Third international conference on Advances in Swarm Intelligence - Volume Part II
An approach to reshaping clusters for nearest neighbor search

IDEAL'12 Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning
Dynamic k-means: a clustering technique for moving object trajectories

International Journal of Intelligent Information and Database Systems
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
An OLAM-based framework for complex knowledge pattern discovery in distributed-and-heterogeneous-data-sources and cooperative information systems

DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Continuous kernel-based outlier detection over distributed data streams

ISPA'07 Proceedings of the 2007 international conference on Frontiers of High Performance Computing and Networking
Codebook design of keyblock based image retrieval

ICEC'07 Proceedings of the 6th international conference on Entertainment Computing
DQR: a probabilistic approach to diversified query recommendation

Proceedings of the 21st ACM international conference on Information and knowledge management
Efficient stochastic algorithms for document clustering

Information Sciences: an International Journal
Adaptive non-parametric identification of dense areas using cell phone records for urban analysis

Engineering Applications of Artificial Intelligence
Continuous adaptive outlier detection on distributed data streams

HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
Dynamic topography information landscapes: an incremental approach to visual knowledge discovery

DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
Credit-Card fraud profiling using a hybrid incremental clustering methodology

SUM'12 Proceedings of the 6th international conference on Scalable Uncertainty Management
Noise-enhanced clustering and competitive learning algorithms

Neural Networks
Ranked k-medoids: A fast and accurate rank-based partitioning algorithm for clustering large datasets

Knowledge-Based Systems
Enhancing density-based clustering: Parameter reduction and outlier detection

Information Systems
Mining neighbor-based patterns in data streams

Information Systems
Dynamic credit-card fraud profiling

MDAI'12 Proceedings of the 9th international conference on Modeling Decisions for Artificial Intelligence
Clustering based on rank distance with applications on DNA

ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V
Clustering and labeling of multi-dimensional mixed structured data

Search Computing
ASCCN: Arbitrary Shaped Clustering Method with Compatible Nucleoids

International Journal of Data Warehousing and Mining
Data Field for Hierarchical Clustering

International Journal of Data Warehousing and Mining
Spatial Clustering in SOLAP Systems to Enhance Map Visualization

International Journal of Data Warehousing and Mining
FINGERPRINT: Summarizing Cluster Evolution in Dynamic Environments

International Journal of Data Warehousing and Mining
Hamming Distance based Clustering Algorithm

International Journal of Information Retrieval Research
Weighted Fuzzy-Possibilistic C-Means Over Large Data Sets

International Journal of Data Warehousing and Mining
Finding homogeneous groups in trajectory streams

Proceedings of the Third ACM SIGSPATIAL International Workshop on GeoStreaming
Self-Organizing Tree Using Artificial Ants

Journal of Information Technology Research
A data partitioning approach for hierarchical clustering

Proceedings of the 7th International Conference on Ubiquitous Information Management and Communication
Discovering inappropriate billings with local density based outlier detection method

AusDM '09 Proceedings of the Eighth Australasian Data Mining Conference - Volume 101
An efficient method for discovering motifs in large time series

ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part I
Scalable fine-grained behavioral clustering of HTTP-based malware

Computer Networks: The International Journal of Computer and Telecommunications Networking
A novel ant-based clustering algorithm using Renyi entropy

Applied Soft Computing
CS2: a new database synopsis for query estimation

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Effectively grouping trajectory streams

NFMCP'12 Proceedings of the First international conference on New Frontiers in Mining Complex Patterns
Scaling analytics applications with OpenCL for loosely coupled heterogeneous clusters

Proceedings of the ACM International Conference on Computing Frontiers
Sumblr: continuous summarization of evolving tweet streams

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Space sensitive cache dumping for post-silicon validation

Proceedings of the Conference on Design, Automation and Test in Europe
Adaptive monitoring: a framework to adapt passive monitoring using probing

Proceedings of the 8th International Conference on Network and Service Management
A sample-based hierarchical adaptive K-means clustering method for large-scale video retrieval

Knowledge-Based Systems
Data weighing mechanisms for clustering ensembles

Computers and Electrical Engineering
TSum: fast, principled table summarization

Proceedings of the Seventh International Workshop on Data Mining for Online Advertising
Scalable parallel OPTICS data clustering using graph algorithmic techniques

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Similarity queries: their conceptual evaluation, transformations, and processing

The VLDB Journal — The International Journal on Very Large Data Bases
Automatic player behavior analysis system using trajectory data in a massive multiplayer online game

Multimedia Tools and Applications
Leveraging microblogging big data with a modified density-based clustering approach for event awareness and topic ranking

Journal of Information Science
Clustering cubes with binary dimensions in one pass

Proceedings of the sixteenth international workshop on Data warehousing and OLAP
Is data clustering in adversarial settings secure?

Proceedings of the 2013 ACM workshop on Artificial intelligence and security
Data stream clustering: A survey

ACM Computing Surveys (CSUR)
A fast algorithm for clustering with mapreduce

ISNN'13 Proceedings of the 10th international conference on Advances in Neural Networks - Volume Part I
CRUDAW: a novel fuzzy technique for clustering records following user defined attribute weights

AusDM '12 Proceedings of the Tenth Australasian Data Mining Conference - Volume 134
A comprehensive study of idistance partitioning strategies for kNN queries and high-dimensional data indexing

BNCOD'13 Proceedings of the 29th British National conference on Big Data
Clustering spatial data streams for targeted alerting in disaster response

Proceedings of the 4th ACM SIGSPATIAL International Workshop on GeoStreaming
Energy-based function to evaluate data stream clustering

Advances in Data Analysis and Classification
Online fuzzy medoid based clustering algorithms

Neurocomputing
Competitive positioning and performance assessment in the construction industry

Expert Systems with Applications: An International Journal
Evolutionary k-means for distributed data sets

Neurocomputing
Mining stable patterns in multiple correlated databases

Decision Support Systems
A new interactive semi-supervised clustering model for large image database indexing

Pattern Recognition Letters
MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data

Frontiers of Computer Science: Selected Publications from Chinese Universities
Effects of resampling method and adaptation on clustering ensemble efficacy

Artificial Intelligence Review
Data integration techniques for the measurement of the reliability of sample variables

International Journal of Business Intelligence and Data Mining
Dealing with trajectory streams by clustering and mathematical transforms

Journal of Intelligent Information Systems
Shortest-linkage-based parallel hierarchical clustering on main-belt moving objects of the solar system

Future Generation Computer Systems
The k-modes type clustering plus between-cluster information for categorical data

Neurocomputing
Building fast decision trees from large training sets

Intelligent Data Analysis
Data stream dynamic clustering supported by Markov chain isomorphisms

Intelligent Data Analysis
Text mapping: Visualising unstructured, structured, and time-based text collections

Intelligent Decision Technologies - Knowledge Visualization
A multivariate fuzzy system applied for outliers detection

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology
Subspace clustering of high-dimensional data: an evolutionary approach

Applied Computational Intelligence and Soft Computing

Quantified Score

Hi-index	0.02

Abstract

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and the minimization of I/O costs.

This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) and demonstrates that it is especially suitable for very large databases. BIRCH incrementally and dynamically clusters incoming multi-dimensional metric data points to try to produce the best-quality clustering with the available resources (i.e., available memory and time constraints). BIRCH can typically find a good clustering with a single scan of the data, and improve the quality further with a few additional scans. BIRCH is also the first clustering algorithm proposed in the database area to handle "noise" (data points that are not part of the underlying pattern) effectively.

We evaluate BIRCH's time/space efficiency, data input order sensitivity, and clustering quality through several experiments. We also present a performance comparison of BIRCH versus CLARANS, a clustering method proposed recently for large datasets, and show that BIRCH is consistently superior.
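
The abstract does not spell out how points are summarized during the single scan. The sketch below is only an illustration of the incremental idea, not the authors' implementation. It assumes the clustering-feature summary described in the full paper (point count, linear sum, and sum of squared norms per subcluster) and replaces the paper's height-balanced tree with a flat list. The names `ClusteringFeature` and `single_scan`, and the rule of absorbing a point when it lies within `threshold` of the nearest centroid, are hypothetical simplifications chosen for brevity.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Compact summary of a subcluster: count, linear sum, sum of squared norms."""
    n: int = 0
    ls: List[float] = field(default_factory=list)
    ss: float = 0.0

    def add(self, point: Sequence[float]) -> None:
        # Absorb one point by updating the three running totals.
        if not self.ls:
            self.ls = [0.0] * len(point)
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
        self.ss += sum(x * x for x in point)

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of members to the centroid,
        # computed from the summary alone: sqrt(SS/N - ||LS/N||^2).
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))


def single_scan(points, threshold: float) -> List[ClusteringFeature]:
    """One pass over the data; a flat list stands in for the tree (assumption)."""
    clusters: List[ClusteringFeature] = []
    for p in points:
        best, best_d = None, math.inf
        for cf in clusters:
            d = math.dist(cf.centroid(), p)
            if d < best_d:
                best, best_d = cf, d
        if best is not None and best_d <= threshold:
            best.add(p)          # close enough: absorb into the nearest summary
        else:
            cf = ClusteringFeature()
            cf.add(p)            # otherwise start a new subcluster
            clusters.append(cf)
    return clusters


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    for cf in single_scan(data, threshold=3.0):
        print(cf.n, cf.centroid(), cf.radius())
```

Because each summary is additive, a point can be absorbed in time proportional to its dimensionality without revisiting earlier points, which is what makes a single scan with bounded memory plausible. The result depends on input order and on the chosen threshold, which is consistent with the abstract's mention of input-order sensitivity and of additional scans to improve quality.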