In this paper, we present a new record linkage approach that uses entity behavior to decide whether potentially different entities are in fact the same. An entity's behavior is extracted from a transaction log that records the actions of this entity with respect to a given data source. The core of our approach is a technique that merges the behaviors of two candidate matching entities and uses the resulting gain in recognizing behavior patterns as their matching score. The idea is that if the merged behavior is well recognized, then the two original behaviors most likely belong to the same entity, because the behavior becomes more complete after the merge. We present the necessary algorithms to model entities' behavior and to compute a matching score for them. To improve the computational efficiency of our approach, we precede the actual matching phase with a fast candidate generation phase that uses a "quick and dirty" matching method. Extensive experiments on real data show that our approach can significantly enhance record linkage quality while remaining practical for large transaction logs.
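The merge-and-score idea can be illustrated with a minimal sketch. This is not the algorithm from the paper: the abstract does not specify how behavior is modeled or how pattern recognition is measured, so the sketch assumes a behavior is simply a list of integer event times and uses the regularity of the gaps between events as a stand-in pattern score. The names `pattern_score` and `merge_gain` are hypothetical.

```python
from statistics import mean, pstdev


def pattern_score(event_times):
    """Stand-in pattern score (an assumption, not the paper's measure).

    Returns a value in [0, 1]; higher means the gaps between
    consecutive events are more regular.
    """
    times = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        return 0.0  # too few events to speak of a pattern
    avg_gap = mean(gaps)
    if avg_gap == 0:
        return 0.0  # all events coincide; no usable rhythm
    # The coefficient of variation is 0 for perfectly regular gaps.
    return 1.0 / (1.0 + pstdev(gaps) / avg_gap)


def merge_gain(log_a, log_b):
    """Matching score: how much better the merged behavior is
    recognized than the better of the two original behaviors."""
    merged = sorted(list(log_a) + list(log_b))
    best_alone = max(pattern_score(log_a), pattern_score(log_b))
    return pattern_score(merged) - best_alone


# Hypothetical logs: a and b are two halves of one entity that acts
# every 2 time units; c belongs to an unrelated entity.
a = [0, 2, 8, 10, 16]
b = [4, 6, 12, 14, 18]
c = [1, 2, 9, 11, 17]

print("gain(a, b) =", merge_gain(a, b))
print("gain(a, c) =", merge_gain(a, c))
```

Under these assumptions, merging the two halves of the same entity restores a regular rhythm and yields a positive gain, while merging unrelated logs makes the pattern less regular and yields a negative gain. A real system would replace `pattern_score` with whatever behavior model it uses and would run this comparison only on pairs that survive candidate generation.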