A general evaluation measure for document organization tasks

  • Authors:
  • Enrique Amigó; Julio Gonzalo; Felisa Verdejo

  • Affiliations:
  • UNED, Madrid, Spain (all authors)

  • Venue:
  • Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

  • Year:
  • 2013

Abstract

A number of key Information Access tasks -- Document Retrieval, Clustering, Filtering, and their combinations -- can be seen as instances of a generic "document organization" problem that establishes priority and relatedness relationships between documents (in other words, a problem of forming and ranking clusters). As far as we know, the evaluation of these tasks has not yet been analyzed from such a global perspective. In this paper we propose two complementary evaluation measures -- Reliability and Sensitivity -- for the generic Document Organization task, derived from a proposed set of formal constraints (properties that any suitable measure must satisfy). In addition to being the first measures that can be applied to any mixture of ranking, clustering and filtering tasks, Reliability and Sensitivity satisfy more formal constraints than previously existing evaluation metrics for each of the subsumed tasks. Beyond their formal properties, their most salient feature from an empirical point of view is their strictness: a high score according to the harmonic mean of Reliability and Sensitivity ensures a high score with any of the most popular evaluation metrics on all the Document Retrieval, Clustering and Filtering datasets used in our experiments.
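
The abstract gives no formal definitions, but its framing -- a system output and a gold standard, each inducing priority and relatedness relationships between document pairs -- suggests a precision/recall-style comparison of those pairwise relationships, combined via a harmonic mean. The following Python sketch illustrates that reading; the relation extraction, the function names, and the unweighted precision/recall simplification are our assumptions, not the paper's exact definitions.

    from itertools import combinations

    def pairwise_relations(ranking, clusters):
        """Derive the two relationship types named in the abstract:
        'priority' (a is ranked above b) and 'related' (a and b share a
        cluster). `ranking` maps doc -> score (higher = ranked earlier);
        `clusters` maps doc -> cluster id (None = filtered out)."""
        rels = set()
        for a, b in combinations(sorted(ranking), 2):
            if ranking[a] > ranking[b]:
                rels.add(("priority", a, b))
            elif ranking[b] > ranking[a]:
                rels.add(("priority", b, a))
            if clusters.get(a) is not None and clusters.get(a) == clusters.get(b):
                rels.add(("related", a, b))  # a < b, so each pair is stored once
        return rels

    def reliability_sensitivity(sys_rels, gold_rels):
        """Reliability: fraction of system relationships confirmed by the
        gold standard (precision-like). Sensitivity: fraction of gold
        relationships captured by the system (recall-like)."""
        overlap = len(sys_rels & gold_rels)
        reliability = overlap / len(sys_rels) if sys_rels else 0.0
        sensitivity = overlap / len(gold_rels) if gold_rels else 0.0
        return reliability, sensitivity

    def f_measure(r, s):
        """Harmonic mean of Reliability and Sensitivity, the combined
        score the abstract refers to (an F1-style combination)."""
        return 2 * r * s / (r + s) if r + s else 0.0

    # Toy example: gold ranks d1 > d2 > d3 and clusters {d1, d2} together;
    # the system swaps d2/d3 and clusters {d2, d3} instead.
    gold = pairwise_relations({"d1": 3, "d2": 2, "d3": 1},
                              {"d1": "A", "d2": "A", "d3": "B"})
    system = pairwise_relations({"d1": 3, "d2": 1, "d3": 2},
                                {"d1": "A", "d2": "B", "d3": "B"})
    r, s = reliability_sensitivity(system, gold)
    print(r, s, f_measure(r, s))  # 0.5 0.5 0.5

Because both measures must be high for their harmonic mean to be high, a system cannot compensate for missing gold relationships by asserting only a few safe ones of its own (or vice versa), which is consistent with the strictness property the abstract claims.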