RankClus: integrating clustering with ranking for heterogeneous information network analysis

Authors:
Yizhou Sun; Jiawei Han; Peixiang Zhao; Zhijun Yin; Hong Cheng; Tianyi Wu
Affiliations:
University of Illinois at Urbana-Champaign; University of Illinois at Urbana-Champaign; University of Illinois at Urbana-Champaign; University of Illinois at Urbana-Champaign; The Chinese University of Hong Kong; University of Illinois at Urbana-Champaign
Venue:
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2009

Abstract

As information networks become ubiquitous, extracting knowledge from them has become an important task. Both ranking and clustering can provide overall views of information network data, and each has been an active research topic in its own right. However, ranking objects globally without considering which clusters they belong to often produces meaningless results; for example, ranking database and computer architecture conferences together may not make much sense. Similarly, placing a huge number of objects (e.g., thousands of authors) in one cluster without distinction is uninformative. In this paper, we address the problem of generating clusters for a specified type of objects, together with ranking information for all types of objects based on these clusters, in a multi-typed (i.e., heterogeneous) information network. We propose a novel clustering framework called RankClus that directly generates clusters integrated with ranking. Starting from K initial clusters, ranking is applied to each cluster separately, and the resulting rank distribution serves as a good measure for that cluster. We then use a mixture model to decompose each object into a K-dimensional vector, where each dimension is a component coefficient with respect to one cluster, measured by that cluster's rank distribution. Objects are then reassigned to the nearest cluster in the new measure space to improve the clustering. As a result, the quality of clustering and ranking are mutually enhanced: the clusters become more accurate and the rankings more meaningful. This progressive refinement iterates until little change can be made. Our experimental results show that RankClus generates more accurate clusters, and does so more efficiently, than state-of-the-art link-based clustering methods. Moreover, clustering results with ranks provide more informative views of the data than traditional clustering.
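
The iterative loop described in the abstract (rank within each cluster, decompose each object into K coefficients, reassign to the nearest cluster, repeat) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the ranking function, the mixture-model estimator, or the distance measure, so everything concrete here is an assumption. The names `simple_rank`, `coefficients`, and `rankclus_sketch` are hypothetical; ranking is plain normalized link weight, the coefficients come from a one-step posterior with a uniform cluster prior rather than a fitted mixture model, and "nearest" is taken to mean highest cosine similarity to the cluster's mean coefficient vector. The network is assumed to be bi-typed and given as a matrix `W` of link weights between target objects (rows) and attribute objects (columns), and only the attribute objects are ranked.

```python
import math


def simple_rank(W, members):
    """Rank attribute objects inside one cluster by normalized link weight.

    Assumption: a stand-in for whatever ranking function is actually used.
    """
    n_attr = len(W[0])
    totals = [sum(W[i][j] for i in members) for j in range(n_attr)]
    eps = 1e-9  # smoothing so that no attribute gets exactly zero rank
    denom = sum(totals) + eps * n_attr
    return [(t + eps) / denom for t in totals]


def coefficients(W, ranks):
    """Decompose each target object into a K-dimensional coefficient vector.

    Assumption: a one-step posterior p(k | attribute j) proportional to
    rank_k(j) with a uniform cluster prior, averaged over the target's links.
    """
    K, n_attr = len(ranks), len(W[0])
    post = []
    for j in range(n_attr):
        col = [ranks[k][j] for k in range(K)]
        z = sum(col)
        post.append([c / z for c in col])
    thetas = []
    for row in W:
        weight = sum(row)
        if weight == 0:  # isolated object: no evidence for any cluster
            thetas.append([1.0 / K] * K)
            continue
        thetas.append([sum(row[j] * post[j][k] for j in range(n_attr)) / weight
                       for k in range(K)])
    return thetas


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def members_of(labels, K):
    return [[i for i, lab in enumerate(labels) if lab == k] for k in range(K)]


def rankclus_sketch(W, K, init_labels, max_iter=50):
    """Alternate ranking and reassignment until the labels stop changing."""
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = members_of(labels, K)
        ranks = [simple_rank(W, members) for members in clusters]
        thetas = coefficients(W, ranks)
        centers = []
        for k, members in enumerate(clusters):
            if members:
                centers.append([sum(thetas[i][d] for i in members) / len(members)
                                for d in range(K)])
            else:  # empty cluster: fall back to its own axis (assumption)
                centers.append([1.0 if d == k else 0.0 for d in range(K)])
        new_labels = [max(range(K), key=lambda k: cosine(theta, centers[k]))
                      for theta in thetas]
        if new_labels == labels:
            break
        labels = new_labels
    # Recompute ranks so they match the final clusters.
    ranks = [simple_rank(W, members) for members in members_of(labels, K)]
    return labels, ranks


# Toy network: targets 0-2 link mostly to attributes 0-1, targets 3-5 to 2-3.
W = [[5, 4, 0, 0], [4, 5, 0, 0], [5, 5, 1, 0],
     [0, 0, 5, 4], [0, 1, 4, 5], [0, 0, 5, 5]]
# Start from a partly wrong clustering (targets 2 and 5 are misplaced).
labels, ranks = rankclus_sketch(W, K=2, init_labels=[0, 0, 1, 1, 1, 0])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy run the two misplaced objects are moved after the first pass and the second pass changes nothing, so the loop stops. The returned `ranks` are conditional on the final clusters, which reflects the abstract's point that ranking is only meaningful within a cluster.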