Cascade: crowdsourcing taxonomy creation

Authors:
Lydia B. Chilton;Greg Little;Darren Edge;Daniel S. Weld;James A. Landay
Affiliations:
University of Washington, Seattle, Washington, USA;oDesk, Redwood City, California, USA;Microsoft Research Asia, Beijing, Beijing, China;University of Washington, Seattle, Washington, USA;University of Washington, Seattle, Washington, USA
Venue:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Year:
2013

Abstract

Taxonomies are a useful and ubiquitous way of organizing information. However, creating organizational hierarchies is difficult because the process requires a global understanding of the objects to be categorized. Usually a taxonomy is created by an individual or a small group of people working together for hours or even days. Unfortunately, this centralized approach does not work well for the large, quickly changing datasets found on the web. Cascade is an automated workflow that allows crowd workers to spend as little as 20 seconds each while collectively making a taxonomy. We evaluate Cascade and show that on three datasets its quality is 80-90% of that of experts. Cascade's cost is competitive with that of expert information architects, despite requiring six times more human labor. Fortunately, this labor can be parallelized so that Cascade can run in as little as four minutes instead of hours or days.