Entity matching is a key task for data integration and especially challenging for Web data. Effective entity matching typically requires combining several match techniques and finding suitable configuration parameters, such as similarity thresholds. The authors investigate to what degree machine learning helps semi-automatically determine suitable match strategies with a limited amount of manual training effort. They use a new framework, Fever, to evaluate several learning-based approaches for matching different sets of Web data entities. In particular, they study different approaches for training-data selection and how much training is needed to find effective combined match strategies and configurations.
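To make the idea of a learned match configuration concrete, the following is a minimal sketch in Python. It is not Fever's API or the authors' method. It assumes two simple similarity measures combined by an unweighted average, and it picks the similarity threshold that maximizes F1 on a small set of manually labeled pairs. All function names, the candidate threshold grid, and the training pairs are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def string_sim(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated, lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def combined_score(a: str, b: str) -> float:
    """Assumed combination: unweighted average of the two matchers."""
    return (string_sim(a, b) + token_jaccard(a, b)) / 2


def f1(threshold: float, labeled_pairs) -> float:
    """F1 of the rule 'score >= threshold means match' on labeled pairs."""
    tp = fp = fn = 0
    for a, b, is_match in labeled_pairs:
        predicted = combined_score(a, b) >= threshold
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif not predicted and is_match:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn_threshold(labeled_pairs, candidates=None) -> float:
    """Pick the candidate threshold with the best F1 on the training pairs."""
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    return max(candidates, key=lambda t: f1(t, labeled_pairs))


# Hypothetical, hand-labeled training pairs (illustrative only).
TRAINING_PAIRS = [
    ("Canon EOS 600D", "canon eos 600d camera", True),
    ("Apple iPhone 4 16GB", "iPhone 4 16 GB Apple", True),
    ("Canon EOS 600D", "Nikon D3100", False),
    ("Apple iPhone 4 16GB", "Samsung Galaxy S2", False),
]

if __name__ == "__main__":
    threshold = learn_threshold(TRAINING_PAIRS)
    print("learned threshold:", threshold)
    print("training F1:", f1(threshold, TRAINING_PAIRS))
```

A learning-based approach of the kind the abstract describes would typically also learn how to weight or combine the matchers rather than fixing an average. The sketch isolates only the threshold-selection step so that the role of the labeled training pairs is easy to see.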