Get another label? Improving data quality and data mining using multiple, noisy labelers
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Crowdsourcing for relevance evaluation
ACM SIGIR Forum
Matching Schemas in Online Communities: A Web 2.0 Approach
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Supervised learning from multiple experts: whom to trust when everyone lies a bit
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Efficiently learning the accuracy of labeling sources for selective sampling
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
TurKit: tools for iterative tasks on Mechanical Turk
Proceedings of the ACM SIGKDD Workshop on Human Computation
Cheap and fast---but is it good?: evaluating non-expert annotations for natural language tasks
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Feasibility of human-in-the-loop minimum error rate training
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Crowdsourcing systems on the World-Wide Web
Communications of the ACM
Everyone's an influencer: quantifying influence on twitter
Proceedings of the fourth ACM international conference on Web search and data mining
Human-assisted graph search: it's okay to ask questions
Proceedings of the VLDB Endowment
Human computation: a survey and taxonomy of a growing field
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
CrowdDB: answering queries with crowdsourcing
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Demonstration of Qurk: a query processor for human operators
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Deco: declarative crowdsourcing
Proceedings of the 21st ACM international conference on Information and knowledge management
Using the crowd for top-k and group-by queries
Proceedings of the 16th International Conference on Database Theory
CrowdSeed: query processing on microblogs
Proceedings of the 16th International Conference on Extending Database Technology
Leveraging transitive relations for crowdsourced joins
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
An online cost sensitive decision-making method in crowdsourcing systems
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Towards a generic framework for trustworthy spatial crowdsourcing
Proceedings of the 12th International ACM Workshop on Data Engineering for Wireless and Mobile Access
Evaluating the crowd with confidence
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Optimizing plurality for human intelligence tasks
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
A human-machine method for web table understanding
WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
Mobility and social networking: a data management perspective
Proceedings of the VLDB Endowment
Answering planning queries with the crowd
Proceedings of the VLDB Endowment
Maximum Complex Task Assignment: Towards Tasks Correlation in Spatial Crowdsourcing
Proceedings of International Conference on Information Integration and Web-based Applications & Services
Learning an accurate entity resolution model from crowdsourced labels
Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication
Given a large set of data items, we consider the problem of filtering them based on a set of properties that can be verified by humans. This problem is commonplace in crowdsourcing applications, and yet, to our knowledge, its formal optimization has not been studied: typical solutions rely on heuristics. We formally state several variants of this problem. We develop deterministic and probabilistic algorithms to optimize the expected cost (i.e., number of questions asked) and the expected error. We experimentally show that our algorithms provide significant gains over other strategies. Our algorithms can be applied in a variety of crowdsourcing scenarios and can form an integral part of any query processor that uses human computation.
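To make the cost/error trade-off concrete, here is a minimal illustrative sketch (not the paper's algorithms): a simple sequential strategy that keeps asking humans about an item until the margin of yes- over no-votes reaches a threshold k or a question budget is exhausted, together with an exact computation of its expected cost and error. The worker-accuracy parameter `p`, the margin rule, and all function names are assumptions made for illustration only.

```python
import itertools

def filter_item(answers, k=2, max_questions=5):
    """Sequential filtering strategy (illustrative): stop as soon as the
    margin of yes- over no-votes reaches +/-k, or after max_questions
    answers. Returns (decision, questions_asked)."""
    yes = no = 0
    for i, a in enumerate(answers, start=1):
        if a:
            yes += 1
        else:
            no += 1
        if abs(yes - no) >= k or i == max_questions:
            return (yes >= no, i)
    return (yes >= no, len(answers))

def expected_cost_and_error(p, k=2, max_questions=5, item_passes=True):
    """Exact expectation over all answer sequences, assuming each
    (hypothetical) worker answers correctly with independent probability p."""
    exp_cost, err = 0.0, 0.0
    for seq in itertools.product([True, False], repeat=max_questions):
        prob = 1.0
        for a in seq:
            prob *= p if a == item_passes else (1 - p)
        decision, n = filter_item(seq, k, max_questions)
        # The first n answers fix the outcome; summing over all length-
        # max_questions completions still gives the exact expectation,
        # since completions of a stopped prefix partition its probability.
        exp_cost += prob * n
        err += prob * (decision != item_passes)
    return exp_cost, err
```

Raising k trades cost for accuracy: with perfectly reliable workers (p = 1.0) the strategy stops after exactly k agreeing answers, while noisier workers push the expected number of questions toward the budget.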