In many applications, the same underlying entity can be referred to in a variety of ways. Given a collection of references to entities, we would like to determine the set of true underlying entities and map each reference to its entity. The references may be to entities of different types, and more than one type of entity may need to be resolved at the same time. We propose similarity measures for clustering references that take into account the different relations observed among the typed references. We pose typed entity resolution in relational data as a clustering problem and present experimental results on real data showing that leveraging relations improves on attribute-based models.
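The clustering formulation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the similarity measure proposed in the paper: the token-overlap attribute similarity, the Jaccard overlap of neighbouring clusters used as the relational similarity, the mixing weight `alpha`, the merge `threshold`, and the toy author and venue references. The sketch greedily merges the most similar pair of same-typed clusters until no pair scores above the threshold.

```python
from itertools import combinations


def jaccard(x, y):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def attr_sim(a, b):
    """Hypothetical attribute similarity: Jaccard overlap of name tokens."""
    def tokens(s):
        return set(s.lower().replace(".", "").split())
    return jaccard(tokens(a), tokens(b))


def resolve(refs, links, alpha=0.5, threshold=0.45):
    """Greedy agglomerative clustering of typed references.

    refs:  dict mapping reference id -> (entity type, name string)
    links: iterable of (id, id) pairs, the observed relations
    alpha: weight of relational evidence (alpha=0 uses attributes only)
    """
    cluster_of = {r: r for r in refs}    # reference id -> cluster id
    members = {r: {r} for r in refs}     # cluster id -> its references
    neighbors = {r: set() for r in refs}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def cluster_neighbors(c):
        # Clusters (not raw references) related to any member of c.
        return {cluster_of[n] for r in members[c] for n in neighbors[r]} - {c}

    def sim(c1, c2):
        if refs[c1][0] != refs[c2][0]:   # never merge across entity types
            return 0.0
        a = max(attr_sim(refs[x][1], refs[y][1])
                for x in members[c1] for y in members[c2])
        r = jaccard(cluster_neighbors(c1), cluster_neighbors(c2))
        return (1 - alpha) * a + alpha * r

    while True:
        pairs = combinations(sorted(members), 2)
        score, c1, c2 = max(((sim(p, q), p, q) for p, q in pairs),
                            default=(0.0, None, None))
        if c1 is None or score < threshold:
            break
        for r in members[c2]:
            cluster_of[r] = c1
        members[c1] |= members.pop(c2)
    return sorted(sorted(m) for m in members.values())


# Toy data (invented): three author references and three venue references,
# each author reference linked to the venue reference from the same citation.
refs = {
    "a1": ("author", "John Smith"),
    "a2": ("author", "J. Smith"),
    "a3": ("author", "Jane Smith"),
    "v1": ("venue", "SIGMOD"),
    "v2": ("venue", "SIGMOD"),
    "v3": ("venue", "VLDB"),
}
links = [("a1", "v1"), ("a2", "v2"), ("a3", "v3")]

attribute_only = resolve(refs, links, alpha=0.0)
with_relations = resolve(refs, links, alpha=0.5)
```

On this toy input, "J. Smith" is equally close to "John Smith" and "Jane Smith" on names alone, so the attribute-only run leaves all three author references separate. With relations enabled, the two "SIGMOD" venue references merge first, which gives `a1` and `a2` a shared neighbouring cluster and lets them merge, while `a3` (linked to "VLDB") stays apart. Resolving one entity type thus supplies evidence for resolving the other.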