Evaluation of IR systems has always been difficult because of the need for manually assessed relevance judgments. The advent of large editor-driven taxonomies on the web opens the door to a new evaluation approach. We use the ODP (Open Directory Project) taxonomy to find sets of pseudo-relevant documents via one of two assumptions: 1) a taxonomy entry is relevant to a given query if its editor-entered title exactly matches the query, or 2) all entries in a leaf-level taxonomy category are relevant to a given query if the category title exactly matches the query. We compare and contrast these two methodologies by evaluating six web search engines on a sample from an America Online log of ten million web queries, using mean reciprocal rank (MRR) measures for the first method and precision-based measures for the second. We show that this technique is stable with respect to the query set selected and correlates with a reasonably large manual evaluation.
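To make the two assumptions concrete, the following is a minimal sketch (not the authors' implementation) of how pseudo-relevant URL sets could be derived from taxonomy data and used to score a single engine. The input shapes `odp_entries` and `run`, and all function names, are hypothetical; the sketch assumes exact, case-insensitive string matching between queries and editor-entered titles or leaf-category titles, MRR for the title-match assumption, and precision at k for the category-match assumption.

```python
from collections import defaultdict


def build_pseudo_qrels(odp_entries, queries):
    """Map each query to two pseudo-relevant URL sets derived from the taxonomy.

    odp_entries: iterable of (title, url, category_title, category_urls), where
    category_urls lists every URL in the entry's leaf-level category.
    Assumption 1: an entry is relevant if its editor-entered title exactly
    matches the query. Assumption 2: every entry in a leaf category is relevant
    if the category title exactly matches the query.
    """
    title_rel = defaultdict(set)     # query -> URLs relevant under assumption 1
    category_rel = defaultdict(set)  # query -> URLs relevant under assumption 2
    query_set = {q.lower() for q in queries}
    for title, url, cat_title, cat_urls in odp_entries:
        if title.lower() in query_set:
            title_rel[title.lower()].add(url)
        if cat_title.lower() in query_set:
            category_rel[cat_title.lower()].update(cat_urls)
    return title_rel, category_rel


def reciprocal_rank(results, relevant):
    """Return 1/rank of the first pseudo-relevant URL, or 0 if none appears."""
    for rank, url in enumerate(results, start=1):
        if url in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(results, relevant, k=10):
    """Return the fraction of the top-k results that are pseudo-relevant."""
    top = results[:k]
    return sum(1 for url in top if url in relevant) / k if top else 0.0


def evaluate_engine(run, title_rel, category_rel, k=10):
    """Score one engine: MRR under assumption 1, mean P@k under assumption 2.

    run: dict mapping each query to the engine's ranked list of result URLs.
    """
    rr = [reciprocal_rank(urls, title_rel.get(q, set())) for q, urls in run.items()]
    pk = [precision_at_k(urls, category_rel.get(q, set()), k) for q, urls in run.items()]
    mrr = sum(rr) / len(rr) if rr else 0.0
    mean_pk = sum(pk) / len(pk) if pk else 0.0
    return mrr, mean_pk
```

Repeating `evaluate_engine` over each of the six engines' result lists would yield the per-engine MRR and precision scores; comparing rankings induced by different query samples is one way the stability claim above could be checked.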