We study clustering algorithms for data with Boolean and categorical attributes. We show that traditional clustering algorithms, which cluster points according to the distances between them, are not appropriate for Boolean and categorical attributes. Instead, we propose a novel concept of links to measure the similarity (proximity) between a pair of data points. We develop a robust hierarchical clustering algorithm, ROCK, that employs links rather than distances when merging clusters. Our methods extend naturally to non-metric similarity measures, which are relevant when a domain expert or a similarity table is the only source of knowledge. In addition to presenting detailed complexity results for ROCK, we conduct an experimental study with real-life as well as synthetic data sets. Our study shows that ROCK not only generates better-quality clusters than traditional algorithms but also exhibits good scalability properties.
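The link idea can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: records are treated as sets of items, two records are neighbors when their Jaccard similarity is at least a threshold `theta`, the link count of a pair is the number of neighbors the two records share, and a record is not counted as its own neighbor. The names `jaccard` and `compute_links` are hypothetical, and the sketch covers only link counting, not the merging procedure of ROCK.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compute_links(points, theta):
    """Return {(i, j): number of common neighbors} for all pairs i < j.

    Assumption: i and j are neighbors when jaccard(points[i], points[j])
    is at least theta; a point is not treated as its own neighbor.
    """
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if jaccard(points[i], points[j]) >= theta:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # The link count of a pair is the size of the overlap of their neighbor sets.
    return {
        (i, j): len(neighbors[i] & neighbors[j])
        for i, j in combinations(range(n), 2)
    }


# Hypothetical market-basket style records.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "d"},
    {"x", "y"},
]
links = compute_links(transactions, theta=0.5)
for pair, count in sorted(links.items()):
    print(pair, count)
```

In this toy data the first three records are pairwise neighbors, so each pair among them shares one common neighbor, while the fourth record has no neighbors and zero links to every other record. A link count draws on the neighborhood around a pair rather than on the two records alone, which is the property the abstract contrasts with purely distance-based merging.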