Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects. Most current clustering methods rely only on the properties that belong to the objects per se. However, the similarities between objects are often indicated by the links, and desirable clusters cannot be generated using only the properties of objects.

In this paper we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. In comparison with a previous study (SimRank) that computes link-based similarities recursively on all pairs of objects, we take advantage of the power-law distribution of links and develop a hierarchical structure called SimTree to represent similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. An efficient algorithm is proposed to compute similarities between objects; it avoids pairwise similarity computations by merging computations that go through the same branches in the SimTree. Experiments show that the proposed approach achieves high efficiency, scalability, and accuracy in clustering multi-typed linked objects.
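To make the idea of linkage-based similarity concrete, the sketch below implements the naive all-pairs recursion in the style of SimRank, the baseline the abstract contrasts against. It is not the SimTree method, whose details are not given here. The function name `simrank`, the `in_links` adjacency format, the decay value 0.8, and the iteration count are all assumptions chosen for illustration.

```python
from itertools import product


def simrank(in_links, decay=0.8, iterations=5):
    """Naive all-pairs, SimRank-style similarity (illustrative sketch only).

    in_links maps every node to the list of nodes that link to it.
    Every node that appears as a neighbor must also appear as a key.
    """
    nodes = list(in_links)
    # Start from the identity: an object is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0
           for a, b in product(nodes, repeat=2)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = in_links[a], in_links[b]
            if not na or not nb:
                # Objects without incoming links share no linked context.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of linked objects,
            # damped by the decay factor.
            total = sum(sim[(x, y)] for x in na for y in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim


# Hypothetical toy data: papers p1 and p2 share author a1; p3 has author a2.
toy_links = {
    "a1": [],
    "a2": [],
    "p1": ["a1"],
    "p2": ["a1"],
    "p3": ["a2"],
}
scores = simrank(toy_links)
```

In this toy graph, `p1` and `p2` come out similar because they are linked from the same author, while `p1` and `p3` do not. The loop touches every pair of objects in every iteration, so time and storage grow quadratically with the number of objects; that cost is what the hierarchical SimTree is designed to avoid.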