Analysts synthesize complex, qualitative data to uncover themes and concepts, but the process is time-consuming and cognitively taxing, and automated techniques show mixed success. Crowdsourcing could help by harnessing flexible, powerful human cognition on demand, but it introduces its own challenges, including workers' limited attention and expertise. Further, text data can be complex, high-dimensional, and ill-structured. We address two major challenges unsolved in prior crowd clustering work: scaffolding expertise for novice crowd workers, and creating consistent, accurate categories when each worker sees only a small portion of the data. To address these challenges, we present an empirical study of a two-stage approach that enables crowds to create an accurate and useful overview of a dataset: A) drawing on cognitive theory, we assess how re-representing data can shorten it and focus it on salient dimensions; and B) we introduce an iterative clustering approach that gives workers a global overview of the data. We demonstrate that a classification-plus-context approach elicits the most accurate categories at the most useful level of abstraction.
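The iterative clustering idea, where each worker labels only a small batch but all workers draw on a shared, growing list of categories, can be illustrated with a minimal sketch. This is an assumption-laden toy simulation, not the paper's actual protocol: the function names, the batch structure, and the keyword-reuse heuristic in `toy_label_fn` are all illustrative.

```python
def crowd_cluster(batches, label_fn):
    """Toy simulation (not the paper's protocol): each batch is labeled by one
    simulated worker, and the shared label list grows across iterations so
    later workers can reuse earlier categories -- the 'global overview'."""
    global_labels = []   # shared category list visible to every worker
    assignments = {}     # item -> assigned category
    for batch in batches:
        for item in batch:
            # The worker sees the current global label list when labeling.
            label = label_fn(item, global_labels)
            if label not in global_labels:
                global_labels.append(label)
            assignments[item] = label
    return assignments, global_labels

def toy_label_fn(item, existing_labels):
    # Illustrative heuristic: reuse an existing label if it appears as a word
    # in the item; otherwise propose the item's first word as a new label.
    words = set(item.lower().split())
    for label in existing_labels:
        if label in words:
            return label
    return item.lower().split()[0]

assignments, labels = crowd_cluster(
    [["crowd clustering methods", "scaling crowd work"],
     ["topic model visualization", "topic labeling games"]],
    toy_label_fn,
)
# Items in the second batch reuse the "topic" category created by an
# earlier item, rather than inventing redundant near-duplicates.
```

The point of the sketch is the data flow, not the labeling heuristic: because every worker labels against the same accumulating category list, categories stay consistent even though no worker ever sees the whole dataset.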