Semantic similarity between words or phrases is frequently used to match search queries to documents when straightforward term matching fails. This is particularly important for searching visual databases, where pictures or video clips have been automatically tagged with a small set of semantic concepts based on analysis and classification of the visual content. Here, the textual description of documents is very limited, and semantic similarity based on WordNet's cognitive synonym structure, along with information content derived from term frequencies, can help bridge the gap between an arbitrary textual query and a limited vocabulary of visual concepts. This approach, termed concept-based retrieval, has received significant attention over the last few years, and its success depends heavily on the quality of the similarity measure used to map textual query terms to visual concepts. In this paper, we consider some issues with semantic similarity measures based on Information Content (IC) and propose a way to improve them. In particular, we note that most IC-based similarity measures are derived from a small and relatively outdated corpus (the Brown corpus), which does not adequately capture the usage patterns of many contemporary terms: for example, out of more than 150,000 WordNet terms, only about 36,000 are represented. This shortcoming severely limits the coverage of typical search query terms. We therefore suggest using alternative IC corpora that are larger and better aligned with modern vocabulary usage. We experimentally derive two such corpora using the Google web search engine, and show that they provide better coverage of vocabulary while showing comparable frequencies for Brown corpus terms.
Finally, we evaluate the two proposed IC corpora in a concept-based video retrieval application using the TRECVID 2005, 2006, and 2007 datasets, and show that they improve average precision by up to 200%.
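
To make the notion of information content concrete, the following minimal Python sketch computes IC as the negative log of a concept's relative frequency and derives a similarity score from it. The taxonomy, the term counts, and all function names are hypothetical and chosen for illustration only. The similarity shown (the IC of the most informative common ancestor) is one common IC-based measure; the abstract does not state which measures were actually evaluated.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "boat": "vehicle",
    "animal": "entity",
    "dog": "animal",
}

# Hypothetical raw term counts, standing in for corpus or web hit counts.
TERM_COUNT = {
    "entity": 0, "vehicle": 10, "car": 60,
    "boat": 10, "animal": 5, "dog": 15,
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def cumulative_counts():
    """Add each term's count to the term itself and to every ancestor."""
    totals = dict.fromkeys(PARENT, 0)
    for term, count in TERM_COUNT.items():
        for node in ancestors(term):
            totals[node] += count
    return totals


def information_content(totals, concept, root="entity"):
    """IC(c) = -log p(c), where p(c) is the cumulative count relative to the root."""
    return -math.log(totals[concept] / totals[root])


def ic_similarity(totals, a, b):
    """IC of the most informative ancestor shared by a and b."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(information_content(totals, c) for c in common)


totals = cumulative_counts()
print(ic_similarity(totals, "car", "boat"))  # share the specific ancestor "vehicle"
print(ic_similarity(totals, "car", "dog"))   # share only the root
```

In this sketch, a concept whose cumulative count is zero has no defined IC (the logarithm fails), which mirrors the coverage problem described above: a term absent from the frequency corpus cannot take part in an IC-based comparison.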