PixelTone: a multimodal interface for image editing

Authors:
Gierad P. Laput;Mira Dontcheva;Gregg Wilensky;Walter Chang;Aseem Agarwala;Jason Linder;Eytan Adar
Affiliations:
University of Michigan, Ann Arbor, Michigan & Adobe Research, San Francisco, California, USA;Adobe Research, San Francisco, California, USA;Adobe Research, San Francisco, California, USA;Adobe Research, San Francisco, California, USA;Adobe Research, San Francisco, California, USA;Adobe Research, San Francisco, California, USA;University of Michigan, Ann Arbor, Michigan, USA
Venue:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Year:
2013

Abstract

Photo editing can be a challenging task, and it becomes even more difficult on the small, portable screens of mobile devices that are now frequently used to capture and edit images. To address this problem, we present PixelTone, a multimodal photo editing interface that combines speech and direct manipulation. We observe existing image editing practices and derive a set of principles that guide our design. In particular, we use natural language for expressing desired changes to an image, and sketching to localize these changes to specific regions. To support the language commonly used in photo editing, we develop a customized natural language interpreter that maps user phrases to specific image processing operations. Finally, we perform a user study that evaluates and demonstrates the effectiveness of our interface.