Multi-class object detection with Hough forests using local histograms of visual words
CAIP'11: Proceedings of the 14th International Conference on Computer Analysis of Images and Patterns - Volume Part I
State-of-the-art systems for generic concept detection rely on low-level features and, in some cases, on features based on face detection, optical character recognition and/or speech recognition. In this paper, an approach to semantic video retrieval is presented that systematically utilizes the results of specialized object detectors. Using object detectors trained on separate public data sets, object-based features are generated by assembling detection results into object sequences. A shot-based confidence score and further features, such as position, frame coverage and movement, are computed for each object class. Experimental results on TRECVID test data show significant improvements in retrieval performance, not only for the object classes but also, in particular, for a large number of indirectly related concepts.
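The per-class shot features named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no formulas, so everything here is an assumption made for illustration. That includes the `Detection` record, the `shot_features` function, the use of normalized box coordinates, taking the maximum detector score as the shot confidence, and measuring movement as the mean displacement of the box center between consecutive detections.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    """One detector hit in one frame (hypothetical record layout)."""
    frame: int    # frame index within the shot
    label: str    # object class, e.g. "car"
    score: float  # detector confidence
    x: float      # left edge, normalized to [0, 1]
    y: float      # top edge, normalized to [0, 1]
    w: float      # box width, normalized to [0, 1]
    h: float      # box height, normalized to [0, 1]


def shot_features(detections):
    """Aggregate the detections of one shot into per-class features."""
    # Group detections by object class to form one sequence per class.
    by_class = defaultdict(list)
    for det in detections:
        by_class[det.label].append(det)

    features = {}
    for label, seq in by_class.items():
        seq.sort(key=lambda d: d.frame)
        centers = [(d.x + d.w / 2, d.y + d.h / 2) for d in seq]
        # Center displacement between consecutive detections.
        steps = [hypot(b[0] - a[0], b[1] - a[1])
                 for a, b in zip(centers, centers[1:])]
        n = len(seq)
        features[label] = {
            # Assumption: shot confidence is the best single detection.
            "confidence": max(d.score for d in seq),
            # Mean box center over the sequence.
            "position": (sum(c[0] for c in centers) / n,
                         sum(c[1] for c in centers) / n),
            # Mean fraction of the frame area covered by the box.
            "coverage": sum(d.w * d.h for d in seq) / n,
            # Mean center displacement; 0.0 for a single detection.
            "movement": sum(steps) / len(steps) if steps else 0.0,
        }
    return features


example = [
    Detection(frame=0, label="car", score=0.6, x=0.0, y=0.0, w=0.5, h=0.5),
    Detection(frame=1, label="car", score=0.9, x=0.5, y=0.0, w=0.5, h=0.5),
]
car = shot_features(example)["car"]
```

In this toy shot, the "car" box moves from the left half of the frame to the right half, so the movement feature is nonzero while the coverage stays constant. Other aggregation choices, such as averaging the scores or weighting by sequence length, would fit the abstract's description equally well.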