A new approach to cross-modal multimedia retrieval
Proceedings of the international conference on Multimedia
This paper presents a novel approach to image retrieval that combines textual and object-based visual features to reduce the inconsistency between a user's subjective interpretation of similarity and the results produced by objective similarity models. A novel multi-scale segmentation framework is proposed to detect prominent image objects. These objects are clustered according to their visual features and mapped to related words determined by psychophysical studies. Furthermore, a hierarchy of words expressing higher-level meaning is constructed on the basis of natural language processing and user evaluation. Experiments conducted on a large set of natural images show that this two-layer word association, together with support for varied query specifications and options, yields higher retrieval precision in capturing user retrieval semantics.