Fast query execution for retrieval models based on path-constrained random walks

Authors:
Ni Lao;William W. Cohen
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2010

Abstract

Many recommendation and retrieval tasks can be represented as proximity queries on a labeled directed graph, in which typed nodes represent documents, terms, and metadata, and labeled edges represent the relationships between them. Recent work has shown that the accuracy of widely used random-walk-based proximity measures can be improved by supervised learning. One especially effective learning technique is based on Path-Constrained Random Walks (PCRW), in which similarity is defined by a learned combination of random walkers, each constrained to follow only a particular sequence of edge labels away from the query nodes. The PCRW-based method significantly outperformed both unsupervised random-walk-based queries and models with learned edge weights. Unfortunately, PCRW query systems are expensive to evaluate. In this study we evaluate approximations to the computation of the PCRW distributions, including fingerprinting, particle filtering, and truncation strategies. In experiments on several recommendation and retrieval problems using two large corpora of scientific publications, we show speedups of factors of 2 to 100 with little loss in accuracy.
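
The abstract describes the scoring idea only in words, so a small illustration may help. The following is a minimal Python sketch, not the authors' implementation: the graph encoding, the function names `pcrw_distribution` and `pcrw_score`, the toy node and edge labels, the path weights, and the `min_prob` cutoff are all assumptions made for this example. It computes the exact distribution of a walker that follows a fixed sequence of edge labels from the query nodes, combines several such paths with weights, and optionally drops low-probability nodes after each step as one simple stand-in for a truncation strategy.

```python
from collections import defaultdict


def pcrw_distribution(edges, query_nodes, path, min_prob=0.0):
    """Distribution of a walker that follows `path` (a sequence of edge labels).

    `edges` maps (node, label) -> list of target nodes (assumed encoding).
    `min_prob` is a hypothetical truncation cutoff: after each step, nodes
    whose probability falls below it are dropped (no renormalization).
    """
    # The walker starts uniformly distributed over the query nodes.
    dist = {n: 1.0 / len(query_nodes) for n in query_nodes}
    for label in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            targets = edges.get((node, label), [])
            # A node with no outgoing edge of this label contributes nothing.
            for t in targets:
                nxt[t] += p / len(targets)
        dist = {n: p for n, p in nxt.items() if p >= min_prob}
    return dist


def pcrw_score(edges, query_nodes, weighted_paths, min_prob=0.0):
    """Weighted combination of path-constrained walk distributions."""
    scores = defaultdict(float)
    for path, weight in weighted_paths:
        dist = pcrw_distribution(edges, query_nodes, path, min_prob)
        for node, p in dist.items():
            scores[node] += weight * p
    return dict(scores)


# Toy graph (hypothetical labels): a term links to papers, papers to authors.
edges = {
    ("walk", "in_title"): ["p1", "p2"],
    ("p1", "written_by"): ["a1"],
    ("p2", "written_by"): ["a1", "a2"],
}

exact = pcrw_distribution(edges, ["walk"], ("in_title", "written_by"))
truncated = pcrw_distribution(
    edges, ["walk"], ("in_title", "written_by"), min_prob=0.3
)
# Illustrative weights; in a learned model these would be fit to training data.
scores = pcrw_score(
    edges, ["walk"], [(("in_title", "written_by"), 2.0), (("in_title",), 1.0)]
)
```

In this toy graph the exact walk assigns more mass to `a1` than to `a2`, because `a1` is reachable through both papers, and the truncated walk keeps only `a1`. The cost of the exact computation grows with the number of nodes reached at each step, which is the kind of expense the approximation strategies named in the abstract aim to reduce.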