Identity resolution aims at identifying newly presented facts and linking them to their previous mentions. Our main hypothesis is that variations of one and the same fact can be recognised and duplicates removed, and that aggregating these variations actually increases the correctness of fact extraction. Our approach to the identity problem has been implemented as the Identity Resolution Framework (IdRF). The framework provides a general solution for identifying known and new facts in specific domains, and it can be used in different applications to process different types of entity. It uses an ontology as the knowledge representation formalism, both internally and for its results. The ontology contains not only the representation of the domain but also known entities and properties. Apart from extracting information from textual sources, we also exploit structured information available in databases, mapping the database schema to the ontology and populating the ontology with existing knowledge. Our main goal is not to advocate one criterion over the others but to introduce a widely applicable solution to the identity resolution problem: we present a set of customisable criteria as well as a mechanism for adding new criteria. We have carried out two series of experiments in two different business intelligence domains, company profiling and recruitment, achieving rather encouraging results.