High performance query expansion using adaptive co-training

Authors:
Jimmy Xiangji Huang;Jun Miao;Ben He
Affiliations:
Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada;Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada;Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada
Venue:
Information Processing and Management: an International Journal
Year:
2013

Abstract

The quality of feedback documents is crucial to the effectiveness of query expansion (QE) in ad hoc retrieval. Recently, machine learning methods have been adopted to tackle this issue by training classifiers on feedback documents. However, the lack of proper training data has prevented these methods from selecting good feedback documents. In this paper, we propose a new method, called AdapCOT, which applies co-training in an adaptive manner to select feedback documents for boosting QE's effectiveness. Co-training is an effective technique for classification with limited training data, which makes it particularly suitable for selecting feedback documents. The proposed AdapCOT method uses a small set of training documents and labels the feedback documents according to their quality through an iterative process. Two mutually exclusive sets of term-based features are selected to train the classifiers. Finally, QE is performed on the documents labeled as positive. Our extensive experiments show that the proposed method improves QE's effectiveness and outperforms strong baselines on various standard TREC collections.
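The iterative labeling described in the abstract can be illustrated with a generic co-training loop. The sketch below is not the AdapCOT algorithm: the abstract does not specify the adaptive component, the classifiers, or the features, so everything here is an assumption made for illustration. Each document is a hypothetical pair of feature vectors (one per feature set, or "view"), the classifier is a simple nearest-centroid rule, and the names `train`, `predict`, and `co_train` are invented for this example. The seed set is assumed to contain at least one positive and one negative document.

```python
from math import dist

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(labeled, view):
    """Fit a nearest-centroid model on one view: (positive centroid, negative centroid)."""
    pos = [x[view] for x, y in labeled if y == 1]
    neg = [x[view] for x, y in labeled if y == 0]
    return centroid(pos), centroid(neg)

def predict(model, x_view):
    """Return (label, confidence); confidence is the gap between the two centroid distances."""
    pos_c, neg_c = model
    d_pos, d_neg = dist(x_view, pos_c), dist(x_view, neg_c)
    return (1 if d_pos < d_neg else 0), abs(d_pos - d_neg)

def co_train(seed, unlabeled, per_round=1, max_rounds=10):
    """Generic co-training: the classifier for each view takes turns labeling
    the unlabeled documents it is most confident about, and those documents
    join the shared labeled pool used to retrain both classifiers."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        for view in (0, 1):
            if not pool:
                return labeled
            model = train(labeled, view)
            # Rank the remaining documents by this view's confidence.
            ranked = sorted(pool, key=lambda x: predict(model, x[view])[1], reverse=True)
            for x in ranked[:per_round]:
                label, _ = predict(model, x[view])
                labeled.append((x, label))
                pool.remove(x)
    return labeled

# Toy data (made up): each document is (view_0_features, view_1_features).
seed = [
    (((1.0, 0.9), (0.8, 1.0)), 1),  # a known good feedback document
    (((0.0, 0.1), (0.1, 0.0)), 0),  # a known poor feedback document
]
unlabeled = [
    ((0.9, 1.0), (0.9, 0.8)),
    ((0.1, 0.0), (0.0, 0.2)),
    ((0.8, 0.7), (0.7, 0.9)),
    ((0.2, 0.1), (0.2, 0.1)),
]
result = co_train(seed, unlabeled)
# Documents labeled positive would then be the ones passed on to query expansion.
positive_docs = [x for x, y in result if y == 1]
```

In this toy setup the two views play the role of the two mutually exclusive feature sets: each classifier sees only its own view, but both benefit from the labels the other contributes.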