Natural language has a one-dimensional structure, whereas images consist of two- or three-dimensional structures. This contrast in dimensionality makes the mapping between words and images a challenging, poorly understood, and undertheorized task. In this paper, we present a general theoretical framework for semantic visual abstraction in massive image databases. The framework applies specifically to facial identification and to the visual search that supports it. It accommodates the now-commonplace observation that, through a graph-based visual abstraction, language allows humans to categorize objects and to annotate shapes verbally. The framework assumes a hidden layer between facial features and the expressive words that refer to them. This hidden layer contains key points of correspondence that can be articulated mathematically, visually, or verbally. We design a semantic visual abstraction network for efficient facial recognition in massive visual datasets, and we demonstrate that a two-way mapping between words and facial shapes is feasible in facial information retrieval and reconstruction.
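
The idea of a hidden layer that supports a two-way mapping between words and facial shapes can be illustrated with a toy sketch. This is not the authors' semantic visual abstraction network. The vocabulary, the two shape parameters, their numeric values, and the function names are all hypothetical, chosen only to show one word-to-shape lookup and one shape-to-word nearest-match lookup sharing the same table of correspondence points.

```python
from math import dist

# Hypothetical hidden layer: each descriptive word is tied to one key point
# in a tiny shape space. The two coordinates are invented for illustration
# (say, a width/height ratio and a jaw-roundness score); they are not taken
# from the paper.
HIDDEN_LAYER = {
    "round": (1.00, 0.90),
    "oval": (0.75, 0.60),
    "square": (0.95, 0.20),
    "long": (0.60, 0.40),
}


def word_to_shape(word):
    """Words -> shapes: return the key point associated with a word."""
    return HIDDEN_LAYER[word]


def shape_to_word(shape):
    """Shapes -> words: return the word whose key point is nearest
    (Euclidean distance) to the measured shape parameters."""
    return min(HIDDEN_LAYER, key=lambda w: dist(HIDDEN_LAYER[w], shape))


if __name__ == "__main__":
    # A measured face shape is described by the closest word ...
    print(shape_to_word((0.72, 0.58)))
    # ... and a word can be turned back into shape parameters.
    print(word_to_shape("square"))
```

In this sketch the table plays the role of the key points of correspondence: retrieval goes from a measured shape to a word, and reconstruction goes from a word back to shape parameters.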