Clustering is an essential data mining task with numerous applications. However, data in most real-life applications are high-dimensional, and the related information is often spread across multiple relations. To make high-dimensional, cross-relational clustering both effective and efficient, we propose a new approach, called CrossClus, which performs cross-relational clustering with user guidance. We believe that user guidance, even in very simple forms, can be essential for effective high-dimensional clustering, since the user knows the application requirements and data semantics well. CrossClus proceeds as follows. A user specifies a clustering task and selects one or a small set of features pertinent to the task. CrossClus then extracts the set of highly relevant features in multiple relations connected via linkages defined in the database schema, evaluates their effectiveness based on the user's guidance, and identifies interesting clusters that fit the user's needs. The method addresses both the quality of feature extraction and the efficiency of clustering. Our comprehensive experiments demonstrate the effectiveness and scalability of this approach.
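The workflow described in the abstract (a user picks a guiding feature, candidate features from linked relations are scored against it, and clusters are formed from the features that score well) can be illustrated with a toy sketch. This is not the CrossClus algorithm itself: the abstract does not specify a similarity measure or a clustering procedure, so the pairwise-agreement cosine score, the single-link grouping, the thresholds, and all feature names and data values below are assumptions chosen only for illustration. The candidate features are assumed to have already been joined onto the target tuples.

```python
from itertools import combinations
from math import sqrt

def pair_vector(values):
    """For every pair of tuples: 1 if they share a value for this feature, else 0."""
    return [1 if values[i] == values[j] else 0
            for i, j in combinations(range(len(values)), 2)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_features(features, guide, min_sim):
    """Keep features whose pairwise grouping resembles the user-chosen guide feature."""
    guide_vec = pair_vector(features[guide])
    scores = {name: cosine(pair_vector(vals), guide_vec)
              for name, vals in features.items()}
    return {name: s for name, s in scores.items() if s >= min_sim}

def cluster(features, weights, threshold):
    """Single-link grouping: merge tuples whose weighted agreement reaches threshold."""
    n = len(next(iter(features.values())))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    total = sum(weights.values())
    for i, j in combinations(range(n), 2):
        agree = sum(w for name, w in weights.items()
                    if features[name][i] == features[name][j])
        if agree / total >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical data: six target tuples with features gathered from linked relations.
features = {
    "research_area": ["db", "db", "db", "ai", "ai", "ai"],  # user-selected guide
    "advisor_group": ["g1", "g1", "g1", "g2", "g2", "g2"],
    "course_taken":  ["c1", "c1", "c2", "c3", "c3", "c3"],
    "home_state":    ["IL", "CA", "IL", "CA", "IL", "CA"],
}

selected = select_features(features, guide="research_area", min_sim=0.5)
clusters = cluster(features, weights=selected, threshold=0.5)
print(sorted(selected))
print(clusters)
```

In this sketch, a feature that partitions the tuples in a way unrelated to the guide (here, the hypothetical `home_state`) receives a low score and is excluded before clustering, which mirrors the idea of letting simple user guidance steer feature selection.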