Quality assurance in document conversion: a HIT?

Authors:
Christoph Becker
Affiliations:
Vienna University of Technology, Vienna, Austria
Venue:
Proceedings of the 4th ACM workshop on Online books, complementary social media and crowdsourcing
Year:
2011

Abstract

This paper discusses the challenges and opportunities of using human computation and crowdsourcing for quality assurance in document conversion processes, and proposes a hybrid computer-human system approach. Digital content is never presented to a user directly; it always requires an intermediate presentation generated by an algorithm (such as a document viewer) that interprets the data. When data such as documents are converted, judging the authenticity of the derived representation requires comparing the intellectually perceivable outcomes of different interpretations. Such quality assurance is a key obstacle to scalability in document conversion processes, and scalable techniques for it are currently severely lacking. We argue that this comparison is a Human Intelligence Task (HIT). To investigate the feasibility, potential pitfalls, and key challenges of leveraging the wisdom of the crowd for this task, we conducted several pilot experiments. We describe and discuss these experiments and identify a number of key challenges that need to be addressed. In particular, we discuss motivation, task semantics, presentation and interaction design, and quality control. Finally, we outline a proposal to address these challenges in a hybrid computer-human system.