We describe a method for predicting query difficulty in a precision-oriented web search task. Our approach uses visual features of the surrogate document representations (titles, snippets, etc.) retrieved for a query to predict retrieval effectiveness for that query. Training a supervised machine learning algorithm on manually evaluated queries uncovers visual clues indicative of relevance. We show that this approach has a moderate correlation of 0.57 with precision-at-10 scores derived from manual relevance judgments of the top ten documents retrieved by ten web search engines over 896 queries. Our findings indicate that difficulty predictors that have been successful in recall-oriented ad hoc search, such as clarity metrics, correlate far less well with engine performance in precision-oriented tasks such as this one, yielding a maximum correlation of 0.3. Additionally, relying only on visual clues avoids the need for the collection statistics that these prior approaches require. This enables our approach to be employed in environments where such statistics are unavailable or costly to retrieve, such as metasearch.
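The evaluation described above compares predicted difficulty against precision at 10 computed from manual relevance judgments. The following is a minimal sketch of that comparison, not the authors' implementation. It assumes binary relevance judgments and a Pearson correlation coefficient (the abstract does not say which coefficient was used). The function names, the judgment lists, and the predicted scores are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def precision_at_k(relevance, k=10):
    """Fraction of the top-k results judged relevant.

    `relevance` is a list of 0/1 judgments in rank order (assumed binary).
    """
    return sum(relevance[:k]) / k


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical manual judgments for the top ten results of three queries.
judged = [
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]
actual = [precision_at_k(r) for r in judged]

# Hypothetical outputs of a trained difficulty predictor for the same queries.
predicted = [0.7, 0.3, 0.4]

r = pearson(predicted, actual)
print(f"correlation between predicted and actual P@10: {r:.2f}")
```

In this sketch the predictor is a black box. In the setting the abstract describes, its inputs would be features extracted from result titles and snippets rather than collection statistics.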