Social and information networks have been studied extensively for years. In this paper, we focus on a large information network composed of entities and relationships, where each entity is associated with a set of keyword terms (kterms) that specify what it is, and the relationships describe the link structure among entities, which can be very complex. Our work is motivated by, but differs from, existing work that finds the best subgraph to describe how user-specified entities are connected. We compute an information nebula (cloud), which is a set P of the top-K kterms that are most correlated to a set Q of user-specified kterms, over a large information network. Our goal is to find how kterms are correlated given the complex information network among entities. Computing an information nebula requires us to consider all possible kterms when selecting the top-K kterms, and to measure the similarity between kterms by considering all possible subgraphs that connect them rather than the single best one. In this work, we compute the information nebula using a global structural-context similarity, and our similarity measure is independent of connection subgraphs. To the best of our knowledge, none of the existing link-based similarity methods considers similarity between two sets of nodes or between two kterms. We propose new algorithms to find the top-K kterms P for a given set of kterms Q based on the global structural-context similarity, without computing the similarity scores of all kterms in the large information network. We conducted extensive performance studies using large real datasets and confirmed the effectiveness and efficiency of our approach.
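To make the problem statement concrete, the following is a minimal brute-force sketch of top-K kterm selection on a toy network. It is not the algorithm proposed in this paper: it substitutes plain iterative SimRank for the global structural-context similarity, it assumes that the similarity between two kterms is the average entity-pair similarity, and it scores every candidate kterm, which is exactly the exhaustive computation the proposed algorithms are designed to avoid. All names (`simrank`, `kterm_similarity`, `top_k_kterms`), the toy graph, the kterm labels, and the decay value are hypothetical.

```python
from itertools import product


def simrank(neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank on an undirected graph {node: set_of_neighbors}."""
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        nxt = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                nxt[(a, b)] = 1.0
            elif not neighbors[a] or not neighbors[b]:
                nxt[(a, b)] = 0.0
            else:
                # Average similarity over all neighbor pairs, damped by the decay factor.
                total = sum(sim[(x, y)] for x in neighbors[a] for y in neighbors[b])
                nxt[(a, b)] = decay * total / (len(neighbors[a]) * len(neighbors[b]))
        sim = nxt
    return sim


def kterm_similarity(sim, entities_of, p, q):
    """Assumed aggregation: mean similarity over all entity pairs carrying p and q."""
    pairs = [(a, b) for a in entities_of[p] for b in entities_of[q]]
    return sum(sim[pair] for pair in pairs) / len(pairs)


def top_k_kterms(sim, entities_of, query, k):
    """Brute force: score every kterm outside the query Q and keep the best K."""
    scored = []
    for term in entities_of:
        if term in query:
            continue
        score = sum(kterm_similarity(sim, entities_of, term, q) for q in query) / len(query)
        scored.append((score, term))
    # Highest score first; ties broken alphabetically for a deterministic result.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy network: five entities and four kterms.
neighbors = {
    "e1": {"e2", "e3"},
    "e2": {"e1", "e4"},
    "e3": {"e1", "e4"},
    "e4": {"e2", "e3", "e5"},
    "e5": {"e4"},
}
entities_of = {
    "database": {"e1"},
    "graph": {"e2", "e3"},
    "mining": {"e4"},
    "web": {"e5"},
}

sim = simrank(neighbors)
result = top_k_kterms(sim, entities_of, query={"graph"}, k=2)
print(result)
```

The sketch materializes the similarity of every entity pair, which takes quadratic space and is feasible only for very small graphs. That cost is the reason for seeking algorithms that return P without computing all similarity scores.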