Adopting ontologies for multisource identity resolution

Authors:
Milena Yankova;Horacio Saggion;Hamish Cunningham
Affiliations:
Science University of Sheffield, United Kingdom;Science University of Sheffield, United Kingdom;Science University of Sheffield, United Kingdom
Venue:
OBI '08 Proceedings of the first international workshop on Ontology-supported business intelligence
Year:
2008

Abstract

Identity resolution aims at identifying newly presented facts and linking them to their previous mentions. Our main hypothesis is that variations of one and the same fact can be recognised and duplicates removed, and that aggregating them increases the correctness of fact extraction. Our approach to the identity problem has been implemented as the Identity Resolution Framework (IdRF). The framework provides a general solution for identifying known and new facts in specific domains, and it can be used in different applications to process different types of entity. It uses an ontology as the knowledge representation formalism for both its internal and its resulting knowledge. The ontology contains not only the representation of the domain but also known entities and their properties. Apart from extracting information from textual sources, we also exploit structured information available in databases by mapping the database schema to the ontology and populating the ontology with existing knowledge. Our main goal is not to advocate one criterion over the others but to introduce a widely applicable solution to the identity resolution problem: we present a set of customisable criteria as well as a mechanism for adding new criteria. We have carried out two series of experiments in two different business intelligence domains, company profiling and recruitment, achieving encouraging results.
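
The idea of customisable, pluggable criteria can be illustrated with a minimal sketch. This is not the IdRF implementation or its API: the `Resolver` class, the two example criteria, the weights, and the 0.75 threshold are all hypothetical, and plain dictionaries stand in for ontology instances. It only shows one way a candidate entity could be compared against known entities and either merged into a match or stored as new.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Entity = Dict[str, str]
# A criterion scores how likely two descriptions denote the same entity (0.0 to 1.0).
Criterion = Callable[[Entity, Entity], float]


def same_name(a: Entity, b: Entity) -> float:
    """Hypothetical criterion: exact match on the case-insensitive, trimmed name."""
    return 1.0 if a.get("name", "").strip().lower() == b.get("name", "").strip().lower() else 0.0


def same_country(a: Entity, b: Entity) -> float:
    """Hypothetical criterion: both entities state the same country."""
    return 1.0 if a.get("country") and a.get("country") == b.get("country") else 0.0


@dataclass
class Resolver:
    threshold: float = 0.75  # assumed cut-off, not taken from the paper
    criteria: List[Tuple[Criterion, float]] = field(default_factory=list)
    known: List[Entity] = field(default_factory=list)  # stands in for the ontology

    def add_criterion(self, criterion: Criterion, weight: float = 1.0) -> None:
        """Register a new criterion; this is the extension point."""
        self.criteria.append((criterion, weight))

    def score(self, a: Entity, b: Entity) -> float:
        """Weighted average of all registered criteria."""
        total = sum(weight for _, weight in self.criteria)
        if total == 0:
            return 0.0
        return sum(weight * crit(a, b) for crit, weight in self.criteria) / total

    def resolve(self, candidate: Entity) -> Entity:
        """Return the matching known entity (after merging) or store a new one."""
        best, best_score = None, 0.0
        for entity in self.known:
            s = self.score(candidate, entity)
            if s > best_score:
                best, best_score = entity, s
        if best is not None and best_score >= self.threshold:
            # Aggregate: add properties the known entity does not have yet.
            for key, value in candidate.items():
                best.setdefault(key, value)
            return best
        self.known.append(dict(candidate))
        return self.known[-1]


resolver = Resolver()
resolver.add_criterion(same_name, weight=3.0)
resolver.add_criterion(same_country, weight=1.0)
resolver.resolve({"name": "Acme Ltd", "country": "UK"})
resolver.resolve({"name": "acme ltd ", "country": "UK", "sector": "tools"})
```

In this sketch the second call matches the first entity and merges the extra `sector` property into it, so only one entity is stored. A new criterion is added by writing another function with the same signature and registering it with `add_criterion`.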