Smoothing document language model with local word graph

Authors:
Yunping Huang;Le Sun;Jian-Yun Nie
Affiliations:
Chinese Academy of Sciences, Beijing, China;Chinese Academy of Sciences, Beijing, China;University of Montreal, Montreal, Canada
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 7
Cited 2

A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Document language models, query models, and risk minimization for information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Model-based feedback in the language modeling approach to information retrieval

Proceedings of the tenth international conference on Information and knowledge management
Integrating word relationships into language models

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Language model information retrieval with document expansion

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Random walk term weighting for information retrieval

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A general optimization framework for smoothing language models on graph structures

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Document expansion based on WordNet for robust IR

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
A document is known by the company it keeps: neighborhood consensus for short text categorization

Language Resources and Evaluation

Quantified Score

Hi-index	0.00

Visualization

Abstract

Smoothing document model with word graph is a new and effective method in information retrieval. Word graph can naturally incorporate the dependency between the words; random walk algorithm based on the graph can be used to estimate the weight of each vertex. In this paper, we present a new way to construct a local word graph for smoothing document model, which exploits the document's k nearest neighbors: the vertices represent the words in the document and its k nearest neighbors, and the weights of the edges are estimated through word co-occurrence in the local document set. We argue that word graph is a key factor to the performance in graph-based smoothing method. By using the local document set, we can obtain a document specific word graph, and achieve better retrieval performance. Experimental results on three TREC collections show that our proposed approach is effective.