We investigated supervised learning methods that resolve entities using labels from crowd workers. Although crowdsourcing can reduce the time and cost of obtaining labeled data, it also brings challenges, such as coping with the variable quality of crowd-generated data. First, we evaluated the quality of crowd-generated labels on actual entity resolution data sets. Then, we evaluated the prediction accuracy of two machine learning methods that use these labels: a conventional LPP method trained on consensus labels obtained by majority voting, and our proposed method, which combines multiple Laplacians directly from the crowdsourced data. Finally, we discussed the relationship between the accuracy of workers' labels and the prediction accuracy of the two methods.
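
To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions and not the paper's implementation: the abstract does not say how the Laplacians are built or combined, so the sketch assumes one unnormalized graph Laplacian per worker (an edge for each record pair that the worker labeled a match) and combines them by a plain sum. The label encoding (1 = match, 0 = non-match, -1 = not labeled) and all function names are hypothetical.

```python
import numpy as np

def majority_vote(labels):
    """Consensus label per record pair from a (n_workers, n_pairs) array.

    Encoding (assumed): 1 = match, 0 = non-match, -1 = worker gave no label.
    Ties are resolved as non-match.
    """
    labels = np.asarray(labels)
    pos = (labels == 1).sum(axis=0)
    neg = (labels == 0).sum(axis=0)
    return (pos > neg).astype(int)

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric affinity matrix."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def worker_affinity(n_records, pairs, worker_labels):
    """Affinity matrix for one worker: weight 1 on each pair labeled a match."""
    W = np.zeros((n_records, n_records))
    for (i, j), y in zip(pairs, worker_labels):
        if y == 1:
            W[i, j] = W[j, i] = 1.0
    return W

def combined_laplacian(n_records, pairs, labels):
    """Sum of per-worker Laplacians (an assumed, simple way to combine them)."""
    return sum(
        laplacian(worker_affinity(n_records, pairs, row)) for row in labels
    )

# Three records, three candidate pairs, three workers.
pairs = [(0, 1), (1, 2), (0, 2)]
labels = [
    [1, 0, 1],   # worker 0
    [1, 1, -1],  # worker 1 skipped the last pair
    [0, 0, 1],   # worker 2
]
consensus = majority_vote(labels)
L = combined_laplacian(3, pairs, labels)
```

With majority voting, each pair collapses to a single label before learning, so the disagreement on pair (1, 2) is lost. The summed Laplacian keeps it: that pair ends up with weight 1, while the pairs that two workers marked as matches end up with weight 2.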