Dynamic personalized pagerank in entity-relation graphs

Authors:
Soumen Chakrabarti
Affiliations:
IIT Bombay, Mumbai, India
Venue:
Proceedings of the 16th international conference on World Wide Web
Year:
2007

Abstract

Extractors and taggers turn unstructured text into entity-relation (ER) graphs where nodes are entities (email, paper, person, conference, company) and edges are relations (wrote, cited, works-for). Typed proximity search of the form type=person NEAR company~"IBM", paper~"XML" is an increasingly useful search paradigm in ER graphs. Proximity search implementations either perform a Pagerank-like computation at query time, which is slow, or precompute, store and combine per-word Pageranks, which can be very expensive in terms of preprocessing time and space. We present HubRank, a new system for fast, dynamic, space-efficient proximity searches in ER graphs. During preprocessing, HubRank computes and indexes certain "sketchy" random walk fingerprints for a small fraction of nodes, carefully chosen using query log statistics. At query time, a small "active" subgraph is identified, bordered by nodes with indexed fingerprints. These fingerprints are adaptively loaded to various resolutions to form approximate personalized Pagerank vectors (PPVs). PPVs at remaining active nodes are now computed iteratively. We report on experiments with CiteSeer's ER graph and millions of real CiteSeer queries. Some representative numbers follow. On our testbed, HubRank preprocesses and indexes 52 times faster than whole-vocabulary PPV computation. A text index occupies 56 MB. Whole-vocabulary PPVs would consume 102 GB. If PPVs are truncated to 56 MB, precision compared to true Pagerank drops to 0.55; in contrast, HubRank has precision 0.91 at 63 MB. HubRank's average query time is 200-300 milliseconds; query-time Pagerank computation takes 11 seconds on average.
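To make the slow baseline concrete, the following is a minimal sketch of a query-time personalized Pagerank computation by plain power iteration on a toy ER graph. It is not HubRank: it uses no fingerprints, hub selection, or active-subgraph pruning. The graph, the node names, the function name `personalized_pagerank`, the walk probability `alpha=0.8`, and the choice to treat every relation as bidirectional are all assumptions made for illustration and are not taken from the paper.

```python
def personalized_pagerank(out_edges, teleport, alpha=0.8, tol=1e-10, max_iter=1000):
    """Power iteration for p = alpha * C p + (1 - alpha) * r.

    out_edges: dict mapping each node to a list of successor nodes; every
               node, including pure targets, must appear as a key.
    teleport:  dict of non-negative weights over the nodes matched by the
               query; it is normalized here to the teleport distribution r.
    alpha:     probability of following an edge (illustrative value).
    """
    nodes = list(out_edges)
    total = sum(teleport.values())
    r = {u: teleport.get(u, 0.0) / total for u in nodes}
    p = dict(r)  # start the iteration from the teleport distribution
    for _ in range(max_iter):
        nxt = {u: (1.0 - alpha) * r[u] for u in nodes}
        for u in nodes:
            targets = out_edges[u]
            if targets:
                share = alpha * p[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Dangling node: send its walk mass back through r.
                for v in nodes:
                    nxt[v] += alpha * p[u] * r[v]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy ER graph; each relation is listed in both directions.
graph = {
    "person:alice": ["company:IBM", "paper:xml"],   # works-for, wrote
    "person:bob": ["paper:other"],                  # wrote
    "company:IBM": ["person:alice"],
    "paper:xml": ["person:alice", "paper:other"],   # written-by, cited-by
    "paper:other": ["person:bob", "paper:xml"],     # written-by, cites
}

# Query in the spirit of: type=person NEAR company~"IBM", paper~"XML"
teleport = {"company:IBM": 1.0, "paper:xml": 1.0}
scores = personalized_pagerank(graph, teleport)

# Keep only nodes of the requested type and rank them by score.
people = sorted(
    (n for n in scores if n.startswith("person:")),
    key=scores.get,
    reverse=True,
)
print(people)
```

Every query in this sketch iterates over the whole graph until convergence, which is the cost that the abstract reports as roughly 11 seconds per query on CiteSeer's ER graph and that HubRank avoids by combining precomputed fingerprints at the border of a small active subgraph.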