Discovering relevant features for effective query formulation

Authors:
Luepol Pipanmaekaporn;Yuefeng Li
Affiliations:
School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, Australia;School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, Australia
Venue:
IRFC'12 Proceedings of the 5th conference on Multidisciplinary Information Retrieval
Year:
2012

Abstract

The quality of the features discovered through relevance feedback (RF) is a key factor in formulating effective search queries. Most existing feedback methods do not carefully address feature selection for noise reduction. As a result, noisy extracted features can easily degrade retrieval effectiveness. In this paper, we propose a novel feature extraction method for query formulation. The method first extracts term association patterns from RF as knowledge for feature extraction. Negative RF is then used to improve the quality of the discovered knowledge. A novel information filtering (IF) model is developed to evaluate the proposed method. Experiments on Reuters Corpus Volume 1 and TREC topics confirm that the proposed model achieves encouraging performance compared with state-of-the-art IF models.
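
The two-stage idea in the abstract (mine term associations from relevant feedback, then use negative feedback to suppress noisy terms) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the pattern type, the weighting, or how negative feedback is applied. The sketch assumes co-occurring term pairs as a stand-in for term association patterns, a document-frequency penalty for negative feedback, and hypothetical names throughout (`mine_term_pairs`, `score_terms`, `formulate_query`, `min_support`, `penalty`).

```python
from collections import Counter
from itertools import combinations


def mine_term_pairs(docs, min_support):
    """Return term pairs that co-occur in at least min_support documents.

    Assumption: pairs stand in for the 'term association patterns'
    mentioned in the abstract, whose exact form is not given there.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))  # each document counts a pair at most once
        counts.update(combinations(terms, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def score_terms(positive_docs, negative_docs, min_support=2, penalty=1.0):
    """Weight terms by pattern support, then penalise terms seen in negative feedback.

    Assumption: the penalty is the term's document frequency in the
    negative set, scaled by `penalty`; this scheme is illustrative only.
    """
    patterns = mine_term_pairs(positive_docs, min_support)
    scores = Counter()
    for pair, support in patterns.items():
        for term in pair:
            scores[term] += support / len(positive_docs)
    if negative_docs:
        neg_df = Counter(t for doc in negative_docs for t in set(doc))
        for term in list(scores):
            scores[term] -= penalty * neg_df[term] / len(negative_docs)
    return scores


def formulate_query(positive_docs, negative_docs, k=3, **kwargs):
    """Return up to k terms with a positive score, highest score first."""
    scores = score_terms(positive_docs, negative_docs, **kwargs)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked[:k] if score > 0]


if __name__ == "__main__":
    positive = [
        ["oil", "price", "market"],
        ["oil", "price", "opec"],
        ["oil", "market", "report"],
    ]
    negative = [
        ["market", "report", "sports"],
        ["market", "football"],
    ]
    print(formulate_query(positive, negative))
```

In this toy data, "market" is supported by a frequent pair in the relevant documents but also appears in every non-relevant document, so the penalty pushes its score below zero and it is dropped from the query. That is the kind of noise reduction the abstract attributes to negative RF, here under the stated assumptions.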