Learning query and document similarities from click-through bipartite graph with metadata

Authors:
Wei Wu;Hang Li;Jun Xu
Affiliations:
Microsoft Research Asia, Beijing, China;Noah's Ark Lab of Huawei Technologies, Hong Kong, China;Noah's Ark Lab of Huawei Technologies, Hong Kong, China
Venue:
Proceedings of the sixth ACM international conference on Web search and data mining
Year:
2013

Abstract

We consider learning query and document similarities from a click-through bipartite graph with metadata on the nodes. The metadata contains multiple types of features of queries and documents. We aim to leverage both the click-through bipartite graph and the features to learn query-document, document-document, and query-query similarities. The challenges include how to model and learn the similarity functions based on the graph data. We propose solving these problems in a principled way. Specifically, we use two different linear mappings to project the queries and documents, which lie in two different feature spaces, into the same latent space, and take the dot product in the latent space as their similarity. Query-query and document-document similarities can also be naturally defined as dot products in the latent space. We formalize the learning of similarity functions as learning the mappings that maximize the similarities of the observed query-document pairs on the enriched click-through bipartite graph. When queries and documents have multiple types of features, the similarity function is defined as a linear combination of multiple similarity functions, each based on one type of feature. We further solve the learning problem by using a new technique called Multi-view Partial Least Squares (M-PLS). Its advantages include a global optimum that can be obtained through Singular Value Decomposition (SVD) and the ability to find high-quality similar queries. We conducted large-scale experiments on enterprise search data and web search data. The experimental results on relevance ranking and similar query finding demonstrate that the proposed method works significantly better than the baseline methods.
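
To make the latent-space idea concrete, the following is a minimal single-view sketch in Python with NumPy, not the paper's M-PLS procedure. It assumes one feature type per side, click counts as edge weights, and orthonormal columns as the constraint on each mapping; under those assumptions, the mappings that maximize the click-weighted sum of query-document dot products are the top singular vectors of the click-weighted cross-product matrix. All names (`fit_latent_mappings`, `similarity`, `Q`, `D`, `W`, `k`) and the random toy data are hypothetical.

```python
import numpy as np

def fit_latent_mappings(Q, D, W, k):
    """Learn two linear mappings into a shared k-dimensional latent space.

    Q: (n_q, p) query feature matrix (assumed layout, one row per query)
    D: (n_d, r) document feature matrix (one row per document)
    W: (n_q, n_d) click counts on the bipartite graph (0 = no edge)
    Assumption: mappings are constrained to have orthonormal columns.
    """
    # Click-weighted cross-product between the two feature spaces.
    M = Q.T @ W @ D                      # shape (p, r)
    # The top-k singular vectors maximize trace(L_q^T M L_d)
    # under the orthonormality constraint.
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k], Vt[:k].T            # L_q: (p, k), L_d: (r, k)

def similarity(A, B, L_a, L_b):
    """Dot products in the latent space between rows of A and rows of B."""
    return (A @ L_a) @ (B @ L_b).T

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
Q = rng.random((5, 8))                   # 5 queries, 8 query features
D = rng.random((6, 10))                  # 6 documents, 10 document features
W = rng.integers(0, 4, size=(5, 6)).astype(float)

L_q, L_d = fit_latent_mappings(Q, D, W, k=3)
qd = similarity(Q, D, L_q, L_d)          # query-document similarities
qq = similarity(Q, Q, L_q, L_q)          # query-query similarities
dd = similarity(D, D, L_d, L_d)          # document-document similarities
```

With several feature types, the abstract describes the overall similarity as a linear combination of per-type similarities of this kind; how the combination weights are learned is not stated here, so that step is left out of the sketch.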