Answering visual questions with conversational crowd assistants

Authors:
Walter S. Lasecki;Phyo Thiha;Yu Zhong;Erin Brady;Jeffrey P. Bigham
Affiliations:
University of Rochester;University of Rochester;University of Rochester;University of Rochester;University of Rochester and Carnegie Mellon University
Venue:
Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility
Year:
2013

Abstract

Blind people face a range of accessibility challenges in their everyday lives, from reading the text on a package of food to traveling independently in a new place. Answering general questions about one's visual surroundings remains well beyond the capabilities of fully automated systems, but recent systems are showing the potential of engaging on-demand human workers (the crowd) to answer visual questions. The input to such systems has generally been either a single image, which can limit the interaction with a worker to one question, or a video stream in which the system pairs the end user with a single worker, limiting the benefits of the crowd. In this paper, we introduce Chorus:View, a system that assists users over the course of longer interactions by engaging workers in a continuous conversation with the user about a video stream from the user's mobile device. We demonstrate the benefit of using multiple crowd workers instead of just one in terms of both latency and accuracy, then conduct a study with 10 blind users that shows Chorus:View answers common visual questions more quickly and accurately than existing approaches. We conclude with a discussion of users' feedback and potential future work on interactive crowd support of blind users.