Domain-Independent Entity Coreference for Linking Ontology Instances

Authors:
Dezhao Song;Jeff Heflin
Affiliations:
Lehigh University;Lehigh University
Venue:
Journal of Data and Information Quality (JDIQ) - Special Issue on Entity Resolution
Year:
2013

Abstract

The objective of entity coreference is to determine whether different mentions (e.g., person names, place names, database records, or ontology instances) refer to the same real-world object. Entity coreference algorithms can be used to detect duplicate database records and to determine whether two Semantic Web instances represent the same underlying real-world entity. The key issues in developing an entity coreference algorithm are how to locate context information and how to use that context appropriately. In this article, we present a novel entity coreference algorithm for ontology instances. For scalability reasons, we select a neighborhood of each instance from an RDF graph. To determine the similarity between two instances, our algorithm computes the similarity between comparable property values in their neighborhood graphs. The similarity of distinct URIs and blank nodes is computed by comparing their outgoing links. To reduce the impact of distant nodes on the final similarity measure, we explore a distance-based discounting approach. To provide the best possible domain-independent matches, we propose an approach that computes the discriminability of triples in order to assign weights to the context information. We evaluated our algorithm using different instance categories from five datasets. Our experiments show that the best results are achieved by including both our discounting and triple discrimination approaches.
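
The abstract names three ingredients (a bounded neighborhood per instance, distance-based discounting, and discriminability-based weights) without giving their formulas. The Python sketch below is only an illustration of how such pieces could fit together, not the authors' algorithm. Everything specific in it is an assumption: the function names, the token-level Jaccard measure used for literal values, the discount factor of 0.5 per extra hop, and the definition of discriminability as the ratio of distinct objects to triples for a predicate. Triples are plain (subject, predicate, object) string tuples, and two values are treated as comparable when they are reached by the same predicate path.

```python
from collections import defaultdict


def neighborhood(triples, start, max_depth=2):
    """Collect (depth, predicate_path, literal) entries reachable from start.

    Only outgoing links are followed. An object that also appears as a
    subject is expanded further; any other object is treated as a literal.
    Nodes still unexpanded at max_depth are dropped (an assumed cut-off).
    """
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))
    entries = []
    frontier = [(start, ())]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for node, path in frontier:
            for p, o in outgoing.get(node, []):
                new_path = path + (p,)
                if o in outgoing:
                    next_frontier.append((o, new_path))
                else:
                    entries.append((depth, new_path, o))
        frontier = next_frontier
    return entries


def discriminability(triples):
    """Assumed weight per predicate: distinct objects / number of triples."""
    values = defaultdict(set)
    counts = defaultdict(int)
    for _, p, o in triples:
        values[p].add(o)
        counts[p] += 1
    return {p: len(values[p]) / counts[p] for p in counts}


def token_jaccard(a, b):
    """Token-level Jaccard similarity (a stand-in string measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def instance_similarity(triples, a, b, discount=0.5, max_depth=2):
    """Weighted average of best matches between comparable path values."""
    weights = discriminability(triples)
    values_b = defaultdict(list)
    for _, path, value in neighborhood(triples, b, max_depth):
        values_b[path].append(value)
    total, weight_sum = 0.0, 0.0
    for depth, path, value in neighborhood(triples, a, max_depth):
        candidates = values_b.get(path)
        if not candidates:
            continue  # no comparable value on the other side
        best = max(token_jaccard(value, c) for c in candidates)
        # Weight by the last predicate's discriminability, discounted per hop.
        w = weights[path[-1]] * discount ** (depth - 1)
        total += w * best
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Made-up toy data for illustration only.
triples = [
    ("p1", "name", "Alice Smith"),
    ("p1", "wrote", "pub1"),
    ("pub1", "title", "Entity Coreference in RDF Graphs"),
    ("p2", "name", "A. Smith"),
    ("p2", "wrote", "pub2"),
    ("pub2", "title", "Entity Coreference in RDF Graphs"),
    ("p3", "name", "Bob Jones"),
    ("p3", "wrote", "pub3"),
    ("pub3", "title", "Scalable Blocking for Linked Data"),
]

same = instance_similarity(triples, "p1", "p2")
different = instance_similarity(triples, "p1", "p3")
print(same > different)
```

In this toy graph, the shared publication title two hops away still contributes to the score for `p1` and `p2`, but with half the weight it would carry at one hop, which is the intent of distance-based discounting. A real system would replace the Jaccard measure and the weight definition with whatever the full article specifies.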