Multi-domain sketch understanding

  • Authors:
  • Christine J. Alvarado;Randall Davis

  • Affiliations:
  • Massachusetts Institute of Technology;Massachusetts Institute of Technology

  • Venue:
  • Multi-domain sketch understanding
  • Year:
  • 2004

Abstract

People use sketches to express and record their ideas in many domains, including mechanical engineering, software design, and information architecture. In recent years, there has been increasing interest in sketch-based user interfaces, but the problem of robust free-sketch recognition remains largely unsolved. Current computer sketch recognition systems are difficult to construct, and either are fragile or achieve robustness by severely limiting the designer's drawing freedom. This work explores the challenges of multi-domain sketch recognition. We present a general framework and implemented system, called SketchREAD, for diagrammatic sketch recognition. Our system can be applied to a variety of domains by providing structural descriptions of the shapes in the domain. Robustness to the ambiguity and uncertainty inherent in complex, freely drawn sketches is achieved through the use of context. Our approach uses context to guide the search for possible interpretations and uses a novel form of dynamically constructed Bayesian networks to evaluate these interpretations. This process allows the system to recover from low-level recognition errors (e.g., a line misclassified as an arc) that would otherwise result in domain-level recognition errors. We evaluated SketchREAD on real sketches in two domains (family trees and circuit diagrams) and found that in both domains the use of context to reclassify low-level shapes significantly reduced recognition error over a baseline system that did not reinterpret low-level classifications. We discuss remaining challenges for multi-domain sketch recognition revealed by our evaluation. Finally, we explore the system's potential role in sketch-based user interfaces from a human-computer interaction perspective. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
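
The abstract's central idea, that context can overturn a low-level misclassification such as a line read as an arc, can be illustrated with a single Bayes-rule update. The sketch below is only an illustration under stated assumptions: it is not SketchREAD's dynamically constructed Bayesian network, and the function name `posterior` and all probability values are hypothetical, chosen to show the effect rather than taken from the work.

```python
def posterior(prior, likelihood):
    """Combine a prior over labels with low-level likelihoods via Bayes' rule."""
    unnormalized = {label: prior[label] * likelihood[label] for label in prior}
    total = sum(unnormalized.values())
    return {label: p / total for label, p in unnormalized.items()}


# Hypothetical likelihoods: the low-level recognizer slightly prefers "arc".
likelihood = {"line": 0.4, "arc": 0.6}

# Baseline without context: no preference between the two labels.
uniform_prior = {"line": 0.5, "arc": 0.5}

# Hypothetical contextual prior: neighboring strokes suggest a domain shape
# whose structural description calls for a line at this position.
context_prior = {"line": 0.8, "arc": 0.2}

baseline = posterior(uniform_prior, likelihood)
with_context = posterior(context_prior, likelihood)

best_baseline = max(baseline, key=baseline.get)
best_with_context = max(with_context, key=with_context.get)

print(best_baseline, best_with_context)  # prints: arc line
```

With a uniform prior the stroke stays an arc. With the contextual prior the same evidence yields a line, which is the kind of low-level reinterpretation the abstract credits for the reduced recognition error.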