Partitioned multi-indexing: bringing order to social search

Authors:
Bahman Bahmani; Ashish Goel
Affiliations:
Stanford University, Stanford, CA, USA; Stanford University, Stanford, CA, USA
Venue:
Proceedings of the 21st international conference on World Wide Web
Year:
2012

Citing 30
Cited 3

An algorithm for finding nearest neighbours in (approximately) constant average time

Pattern Recognition Letters
On the all-pairs-shortest-path problem

STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
A new version of the nearest-neighbour approximating and eliminating search algorithm (AESA) with linear preprocessing time and memory requirements

Pattern Recognition Letters
Randomized algorithms

Randomized algorithms
All-Pairs Almost Shortest Paths

SIAM Journal on Computing
The choice of reference points in best-match file searching

Communications of the ACM
Approximate distance oracles

STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Searching in metric spaces

ACM Computing Surveys (CSUR)
Reachability and distance queries via 2-hop labels

SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Exact and Approximate Distances in Graphs - A Survey

ESA '01 Proceedings of the 9th Annual European Symposium on Algorithms
Virtual landmarks for the internet

Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement
Index-driven similarity search in metric spaces (Survey Article)

ACM Transactions on Database Systems (TODS)
Computing the shortest path: A search meets graph theory

SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Graphs over time: densification laws, shrinking diameters and possible explanations

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
All-pairs shortest paths for unweighted undirected graphs in o(mn) time

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
Optimizing web search using social annotations

Proceedings of the 16th international conference on World Wide Web
Efficient search ranking in social networks

Proceedings of the sixteenth ACM conference on information and knowledge management
Yes, there is a correlation: from social networks to personal behavior on the web

Proceedings of the 17th international conference on World Wide Web
Introduction to Information Retrieval

Introduction to Information Retrieval
Efficient network aware search in collaborative tagging sites

Proceedings of the VLDB Endowment
Triangulation and embedding using small sets of beacons

Journal of the ACM (JACM)
Fast shortest path distance estimation in large networks

Proceedings of the 18th ACM conference on Information and knowledge management
Personalized social search based on the user's social network

Proceedings of the 18th ACM conference on Information and knowledge management
A sketch-based distance oracle for web-scale graphs

Proceedings of the third ACM international conference on Web search and data mining
The anatomy of a large-scale social search engine

Proceedings of the 19th international conference on World Wide Web
Social network document ranking

Proceedings of the 10th annual joint conference on Digital libraries
Data-Intensive Text Processing with MapReduce

Data-Intensive Text Processing with MapReduce
On top-k social web search

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
S4: Distributed Stream Computing Platform

ICDMW '10 Proceedings of the 2010 IEEE International Conference on Data Mining Workshops
SPRINT: ranking search results by paths

Proceedings of the 14th International Conference on Extending Database Technology

Social ranking techniques for the web

Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Network-aware search in social tagging applications: instance optimality versus efficiency

Proceedings of the 22nd ACM international conference on information & knowledge management
Top-K nearest keyword search on large graphs

Proceedings of the VLDB Endowment

Abstract

To answer search queries on a social network rich with user-generated content, it is desirable to give a higher ranking to content that is closer to the individual issuing the query. Queries occur at nodes in the network, documents are also created by nodes in the same network, and the goal is to find the document that matches the query and is closest in network distance to the node issuing the query. In this paper, we present the "Partitioned Multi-Indexing" scheme, which provides an approximate solution to this problem. With m links in the network, after an offline ~O(m) pre-processing time, our scheme allows for social index operations (i.e., social search queries, as well as insertion and deletion of words into and from a document at any node), all in time ~O(1). Further, our scheme can be implemented on open source distributed streaming systems such as Yahoo! S4 or Twitter's Storm so that every social index operation takes ~O(1) processing time and network queries in the worst case, and just two network queries in the common case where the reverse index corresponding to the query keyword is much smaller than the memory available at any distributed compute node. Building on Das Sarma et al.'s approximate distance oracle, the worst-case approximation ratio of our scheme is ~O(1) for undirected networks. Our simulations on the social network Twitter as well as synthetic networks show that in practice, the approximation ratio is actually close to 1 for both directed and undirected networks. We believe that this work is the first demonstration of the feasibility of social search with real-time text updates at large scales.
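To make the problem statement concrete, the following is a minimal sketch of the three social index operations (insert, delete, query) as an exact breadth-first-search baseline. It is not the Partitioned Multi-Indexing scheme: it answers each query in time proportional to the size of the network rather than ~O(1), and it assumes an unweighted, undirected graph held in memory on one machine. The class name `NaiveSocialIndex` and its method names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


class NaiveSocialIndex:
    """Exact BFS baseline for social search (illustrative; not the paper's scheme)."""

    def __init__(self, edges):
        # Undirected adjacency lists built from (u, v) pairs.
        self.adj = defaultdict(set)
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)
        # Reverse index: word -> set of nodes whose document contains it.
        self.postings = defaultdict(set)

    def insert(self, node, word):
        """Add `word` to the document at `node`."""
        self.postings[word].add(node)

    def delete(self, node, word):
        """Remove `word` from the document at `node` (no-op if absent)."""
        self.postings[word].discard(node)

    def query(self, source, word):
        """Return (node, distance) for the closest node holding `word`, or None."""
        holders = self.postings.get(word, set())
        if not holders:
            return None
        seen = {source}
        frontier = deque([(source, 0)])
        while frontier:
            node, dist = frontier.popleft()
            if node in holders:
                return node, dist
            for nxt in self.adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
        return None  # no matching document is reachable from `source`


# Usage on a path graph a - b - c - d.
index = NaiveSocialIndex([("a", "b"), ("b", "c"), ("c", "d")])
index.insert("b", "jazz")
index.insert("d", "jazz")
nearest = index.query("a", "jazz")   # b is one hop from a
index.delete("b", "jazz")
fallback = index.query("a", "jazz")  # only d remains, three hops from a
```

The baseline shows why an approximate scheme is attractive: insertions and deletions are cheap, but each query may traverse the whole network, which is impractical at the scale the abstract targets.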