Apples to oranges: evaluating image annotations from natural language processing systems
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
We automatically construct a dictionary of visual (perceivable in a picture) and non-visual (not directly perceivable in a picture) entities and attributes, based on statistical association techniques used in data mining. We compute whether words that could function as entities, or as attributes of an entity, are correlated with texts that describe images, and use these words to detect visual nouns and visual adjectives. We compare our corpus-based approach with a knowledge-rich approach based on WordNet, and with a combination of the two.
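The abstract does not specify which association measure is used; as an illustration only, one common choice for scoring how strongly a word is associated with image-describing text is pointwise mutual information (PMI). The sketch below assumes two token streams (caption text vs. general text, both invented here) and scores each caption word by its PMI with the caption corpus; high-scoring words would be candidate "visual" entities or attributes.

```python
from collections import Counter
from math import log2

def association_scores(caption_tokens, general_tokens):
    """Score each word's association with image-caption text via PMI:
    log2( P(word, caption) / (P(word) * P(caption)) ).
    A high score suggests the word tends to occur in texts that
    describe images, i.e. it may denote a visual entity/attribute."""
    cap = Counter(caption_tokens)
    gen = Counter(general_tokens)
    total_cap = sum(cap.values())
    total = total_cap + sum(gen.values())
    p_caption = total_cap / total          # prior of the caption "class"
    scores = {}
    for word in cap:                       # only words seen in captions
        p_word = (cap[word] + gen[word]) / total
        p_joint = cap[word] / total
        scores[word] = log2(p_joint / (p_word * p_caption))
    return scores

# Toy corpora (hypothetical data, not from the paper):
captions = "a red apple on a wooden table".split()
news = "the committee approved the annual budget report".split()
scores = association_scores(captions, news)
```

With these toy inputs, words occurring only in the caption text (e.g. "apple", "red") receive the maximal PMI of 1.0 bit, since P(caption) = 0.5; in practice the dictionary would be built by thresholding such scores over large corpora.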