Integrating feature analysis and background knowledge to recommend similarity functions

Authors:
Seung Hwan Ryu;Boualem Benatallah
Affiliations:
School of Computer Science & Engineering, University of New South Wales, Sydney, NSW, Australia;School of Computer Science & Engineering, University of New South Wales, Sydney, NSW, Australia
Venue:
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Year:
2012

Abstract

Existing approaches to similarity analysis pay little attention to choosing the right similarity functions. We present an approach for suggesting which similarity functions (e.g., edit distance) are most appropriate for a given similarity search task. We identify data features (e.g., misspellings) that should be considered when choosing similarity functions. We also introduce the concept of similarity function background knowledge, which associates data features with similarity functions, and apply this knowledge to recommend suitable similarity functions.
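
The idea of associating data features with similarity functions can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the function names, the mapping in BACKGROUND_KNOWLEDGE, and the vote-counting rule in recommend are all hypothetical choices made for illustration. Only "misspellings" and "edit distance" appear in the abstract.

```python
from collections import Counter

# Hypothetical background knowledge: each data feature is associated with
# similarity functions assumed (for illustration only) to handle it well.
BACKGROUND_KNOWLEDGE = {
    "misspellings": ["edit_distance", "jaro_winkler"],
    "token_reordering": ["jaccard", "cosine_tfidf"],
    "abbreviations": ["jaro_winkler", "monge_elkan"],
}


def recommend(features, knowledge=BACKGROUND_KNOWLEDGE):
    """Rank similarity functions by the number of detected features they cover.

    This voting rule is an assumption of the sketch, not the paper's method.
    """
    votes = Counter()
    for feature in features:
        # Features with no entry in the knowledge base contribute no votes.
        for function_name in knowledge.get(feature, []):
            votes[function_name] += 1
    # Highest vote count first; ties are broken alphabetically so the
    # ranking is deterministic.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [function_name for function_name, _ in ranked]


if __name__ == "__main__":
    print(recommend(["misspellings", "abbreviations"]))
```

In this sketch, a function associated with more of the detected features ranks higher; a real recommender would presumably also need a way to detect the features in the data, which is left out here.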