Finding information nebula over large networks

  • Authors:
  • Lijun Chang;Jeffrey Xu Yu;Lu Qin;Yuanyuan Zhu;Haixun Wang

  • Affiliations:
  • The Chinese University of Hong Kong, Hong Kong, China;The Chinese University of Hong Kong, Hong Kong, China;The Chinese University of Hong Kong, Hong Kong, China;The Chinese University of Hong Kong, Hong Kong, China;Microsoft Research Asia, Beijing, China

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

Social and information networks have been studied extensively over the years. In this paper, we focus on a large information network composed of entities and relationships, where entities are associated with sets of keyword terms (kterms) that specify what they are, and relationships describe the potentially complex link structure among entities. Our work is motivated by, but differs from, existing works that find a single best subgraph to describe how user-specified entities are connected. We compute an information nebula (cloud), which is a set of top-K kterms P that are most correlated to a set of user-specified kterms Q, over a large information network. Our goal is to find how kterms are correlated given the complex information network among entities. Computing the information nebula requires us to consider all possible kterms when selecting the top-K, and to measure the similarity between kterms by taking into account all possible subgraphs that connect them rather than the best single one. In this work, we compute the information nebula using a global structural-context similarity, and our similarity measure is independent of connection subgraphs. To the best of our knowledge, none of the existing link-based similarity methods considers the similarity between two sets of nodes or two kterms. We propose new algorithms to find the top-K kterms P for a given set of kterms Q based on the global structural-context similarity, without computing the similarity scores of all kterms in the large information network. We performed extensive performance studies using large real datasets, which confirm the effectiveness and efficiency of our approach.
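
To make the problem setting concrete, the following is a minimal, illustrative sketch (not the paper's algorithm): it uses SimRank as a stand-in for a global structural-context similarity over nodes, maps each kterm to the set of entities carrying it, aggregates pairwise node similarities between entity sets (an assumed aggregation), and ranks candidate kterms against a query kterm set. The function names, the `term_to_entities` mapping, and the averaging scheme are all illustrative assumptions.

    import networkx as nx
    import numpy as np
    from itertools import product

    def top_k_kterms(G, term_to_entities, q_terms, k=5, importance_factor=0.8):
        """Toy sketch: rank candidate kterms by a global structural-context
        similarity to a query kterm set Q.  SimRank is used here only as a
        stand-in; the set-level aggregation below (average pairwise node
        similarity) is an assumption, not the paper's measure."""
        # Node-level similarity computed over the whole graph, i.e. not tied
        # to any single connection subgraph.
        sim = nx.simrank_similarity(G, importance_factor=importance_factor)

        def set_sim(a_entities, b_entities):
            # Assumed aggregation: mean pairwise similarity between entity sets.
            pairs = list(product(a_entities, b_entities))
            return float(np.mean([sim[u][v] for u, v in pairs])) if pairs else 0.0

        # Entities that carry any of the query kterms.
        q_entities = set().union(*(term_to_entities[t] for t in q_terms))

        scores = {
            t: set_sim(q_entities, ents)
            for t, ents in term_to_entities.items()
            if t not in q_terms
        }
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

    # Hypothetical usage on a tiny entity graph.
    G = nx.Graph([(1, 2), (2, 3), (3, 4), (1, 4), (4, 5)])
    term_to_entities = {"database": {1, 2}, "graph": {2, 3}, "mining": {4, 5}}
    print(top_k_kterms(G, term_to_entities, q_terms=["database"], k=2))

Note that the paper avoids computing similarity scores for all kterms; the exhaustive scoring loop above is only meant to illustrate what a top-K answer looks like, not how it is computed efficiently.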