Traditionally, stemming has been applied to Information Retrieval tasks by transforming words in documents to their root form before indexing, and applying a similar transformation to query terms. Although it increases recall, this naive strategy does not work well for Web Search since it lowers precision and requires a significant amount of additional computation. In this paper, we propose a context sensitive stemming method that addresses these two issues. Two unique properties make our approach feasible for Web Search. First, based on statistical language modeling, we perform context sensitive analysis on the query side. We accurately predict which morphological variants of a query term are useful for expanding it before submitting the query to the search engine. This dramatically reduces the number of bad expansions, which in turn reduces the cost of additional computation and improves precision at the same time. Second, our approach performs context sensitive document matching for those expanded variants. This conservative strategy serves as a safeguard against spurious stemming, and it turns out to be very important for improving precision. Using word pluralization handling as an example of our stemming approach, our experiments on a major Web search engine show that, by stemming only 29% of the query traffic, we can improve relevance as measured by average Discounted Cumulative Gain (DCG5) by 6.1% on these queries and 1.8% over all query traffic.
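As a rough illustration of the query-side step, the following sketch scores a plural or singular variant by how well it fits the neighbouring query words under a bigram language model, and keeps it only if it scores close to the original term. This is not the model used in the paper: the add-one smoothing, the `min_ratio` threshold, the function names, and the toy counts are all assumptions made for the example.

```python
from collections import Counter

# Toy counts standing in for statistics from a real query log (invented values).
UNIGRAMS = Counter({"hotel": 100, "hotels": 80, "price": 60,
                    "prices": 50, "comparison": 30})
BIGRAMS = Counter({("hotel", "price"): 20, ("hotel", "prices"): 18,
                   ("hotels", "price"): 2, ("price", "comparison"): 25,
                   ("prices", "comparison"): 1})


def context_score(tokens, i, candidate, unigrams, bigrams, vocab_size):
    """Add-one smoothed bigram score of `candidate` at position i,
    conditioned on its left and right neighbours in the query."""
    score = 1.0
    if i > 0:
        left = tokens[i - 1]
        score *= (bigrams[(left, candidate)] + 1) / (unigrams[left] + vocab_size)
    if i < len(tokens) - 1:
        right = tokens[i + 1]
        score *= (bigrams[(candidate, right)] + 1) / (unigrams[candidate] + vocab_size)
    return score


def useful_variants(tokens, i, variants, unigrams, bigrams, min_ratio=0.5):
    """Return the variants of tokens[i] whose context score is at least
    `min_ratio` times the score of the original term (hypothetical rule)."""
    vocab_size = len(unigrams)
    base = context_score(tokens, i, tokens[i], unigrams, bigrams, vocab_size)
    return [v for v in variants
            if context_score(tokens, i, v, unigrams, bigrams, vocab_size)
            >= min_ratio * base]


# "prices" fits the context "hotel _" about as well as "price" does ...
print(useful_variants(["hotel", "price"], 1, ["prices"], UNIGRAMS, BIGRAMS))
# ... but fits "hotel _ comparison" poorly, so it is not used for expansion.
print(useful_variants(["hotel", "price", "comparison"], 1, ["prices"],
                      UNIGRAMS, BIGRAMS))
```

With these toy counts, the same variant is accepted in one query and rejected in another, which is the sense in which the expansion decision depends on context rather than on the term alone.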