Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
A study of smoothing methods for language models applied to Ad Hoc information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Semantic term matching in axiomatic approaches to information retrieval
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Probabilistic models of ranking novel documents for faceted topic retrieval
Proceedings of the 18th ACM conference on Information and knowledge management
Exploiting query reformulations for web search result diversification
Proceedings of the 19th international conference on World wide web
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Search result diversification for enterprise data
Proceedings of the 20th ACM international conference on Information and knowledge management
Coverage-based search result diversification
Information Retrieval
Mining subtopics from text fragments for a web query
Information Retrieval
Hi-index | 0.00 |
Traditional information retrieval models do not necessarily provide users with optimal search experience because the top ranked documents may contain the same piece of relevant information, i.e., the same subtopic of a query. The goal of search result diversification is to return search results that not only are relevant to the query but also cover different subtopics. Therefore, the subtopic modeling is an important research topic in search result diversification. In this paper, we propose a novel pattern based method to extract subtopics from retrieved documents. The basic idea is to explicitly model a query subtopic as a semantically meaningful text unit in relevant documents. We apply a frequent pattern mining algorithm to efficiently extract these text units (patterns) from retrieved documents. We then model a query subtopic with a single pattern and rank subtopics based on their similarity with the query. These pattern based subtopics are then used to diversify search results.