Accurate and scalable nearest neighbors in large networks based on effective importance

Authors:
Petko Bogdanov;Ambuj Singh
Affiliations:
University of California Santa Barbara, Santa Barbara, CA, USA;University of California Santa Barbara, Santa Barbara, CA, USA
Venue:
Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
Year:
2013

Abstract

Nearest neighbor proximity search in large graphs is an important analysis primitive with a variety of applications in graph data from different domains. We propose a novel proximity measure for weighted graphs called Effective Importance (EI), which incorporates multiple paths between nodes and captures the inherent structural clusters within a network. We develop effective bounds on the EI value using a modified small subnetwork around a query node, enabling scalable exact nearest neighbor (NN) search at query time. Our NN search does not require heavy offline analysis or holistic knowledge of the graph, making our method suitable for very large, dynamically changing networks or composite network overlays. We employ our NN search algorithm on social, information, and biological networks and demonstrate the effectiveness and scalability of the approach. For million-node networks, our method retrieves the exact top 20 neighbors using less than 0.2% of the network edges in a fraction of a second on a conventional desktop machine. We also evaluate the effectiveness of our proximity measure and NN search for three applications, namely (i) finding good local clusters, (ii) network sparsification, and (iii) prediction of node attributes in information networks. The EI measure and NN search method outperform recent counterparts from the literature in all applications.
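
The abstract does not give the EI formula or its bounds, so the following is only an illustrative sketch of the general idea of answering a top-k proximity query from a small subnetwork around the query node. It substitutes a plain random walk with restart for EI, and it offers none of the exactness guarantees the paper claims. The function names (`local_subgraph`, `rwr_top_k`), the parameters (`hops`, `restart`, `iters`), and the `EXAMPLE` graph are all hypothetical.

```python
from collections import deque


def local_subgraph(graph, query, hops):
    """Return the set of nodes within `hops` edges of `query` (BFS)."""
    depth = {query: 0}
    queue = deque([query])
    while queue:
        u = queue.popleft()
        if depth[u] == hops:
            continue  # do not expand beyond the hop limit
        for v in graph.get(u, {}):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return set(depth)


def rwr_top_k(graph, query, k=3, hops=2, restart=0.15, iters=100):
    """Rank nodes near `query` by random walk with restart.

    Stand-in for EI (assumption): the walk is confined to the local
    subgraph, and edge weights are renormalised over in-subgraph neighbours.
    """
    nodes = local_subgraph(graph, query, hops)
    score = {n: 0.0 for n in nodes}
    score[query] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[query] += restart  # total mass stays 1, so restart mass is constant
        for u in nodes:
            nbrs = {v: w for v, w in graph.get(u, {}).items() if v in nodes}
            total = sum(nbrs.values())
            if total == 0:
                # No usable out-edges: send the walker back to the query.
                nxt[query] += (1 - restart) * score[u]
                continue
            for v, w in nbrs.items():
                nxt[v] += (1 - restart) * score[u] * w / total
        score = nxt
    ranked = sorted((n for n in nodes if n != query),
                    key=lambda n: score[n], reverse=True)
    return [(n, score[n]) for n in ranked[:k]]


# Hypothetical weighted, undirected graph: two triangles joined by a weak edge.
EXAMPLE = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}

if __name__ == "__main__":
    for node, value in rwr_top_k(EXAMPLE, "a", k=2, hops=3):
        print(node, round(value, 4))
```

In this toy graph the query node `a` shares a tightly connected triangle with `b` and `c`, so those two rank ahead of the nodes across the weak bridge. This mirrors, in a very simplified way, the abstract's point that a good proximity measure should respect cluster structure while touching only a small part of the network.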