This paper examines results from the last two years of the TRECVID video retrieval evaluations. While there is encouraging evidence of progress in video retrieval, several major disappointments confirm that the field is still in its infancy. Many publications blithely attribute improvements in retrieval performance to particular techniques without paying much attention to the statistical reliability of the comparisons. We analyze the official TRECVID evaluation results using both retrieval experiment error rates and ANOVA measures, and demonstrate that the differences between many systems are not statistically significant. We conclude the paper with lessons learned from results both with and without statistically significant differences.
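To make the ANOVA-based comparison concrete, here is a minimal illustrative sketch (not the paper's actual code or data) of a one-way ANOVA F-statistic computed over hypothetical per-topic average-precision scores for three retrieval systems. All scores below are made-up example values; in practice one would also compare the F statistic against an F-distribution critical value, or use a two-way design that accounts for topic effects.

```python
# Illustrative sketch: one-way ANOVA F-statistic over per-topic
# average-precision (AP) scores for several retrieval systems.
# A large F suggests the between-system variance exceeds what
# per-topic noise alone would explain.

def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-system score lists."""
    k = len(groups)                       # number of systems
    n = sum(len(g) for g in groups)       # total observations
    grand = sum(sum(g) for g in groups) / n
    # Between-system sum of squares (each group weighted by its size).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-system (residual) sum of squares.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical per-topic AP scores for three systems over five topics.
sys_a = [0.30, 0.25, 0.40, 0.35, 0.28]
sys_b = [0.32, 0.27, 0.38, 0.33, 0.30]
sys_c = [0.10, 0.12, 0.15, 0.11, 0.14]

f = anova_f([sys_a, sys_b, sys_c])
print(f"F = {f:.2f}")
```

In this toy example the clearly weaker third system drives a large F value, while systems A and B alone would not separate; this mirrors the paper's point that many apparent system differences dissolve once per-topic variance is taken into account.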