Do We Need Entity-Centric Knowledge Bases for Entity Disambiguation?

Authors:
Stefan Zwicklbauer;Christin Seifert;Michael Granitzer
Affiliations:
University of Passau, Passau, 94032 Germany (all authors)
Venue:
Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies
Year:
2013

Abstract

Entity disambiguation has been studied extensively over the last 10 years, with authors reporting increasingly well-performing systems. However, most studies have focused on general-purpose knowledge bases such as Wikipedia or DBpedia and have left open the question of how those results generalize to more specialized domains. This question is especially important in the context of Linked Open Data, which forms an enormous resource for disambiguation. The influence of domain heterogeneity, size, and quality of the knowledge base remains largely unexplored. In this paper we present an extensive set of experiments on special-purpose knowledge bases from the biomedical domain, in which we evaluate disambiguation performance along four variables: (i) the representation of the knowledge base as either entity-centric or document-centric, (ii) the size of the knowledge base in terms of entities covered, (iii) the semantic heterogeneity of a domain, and (iv) the quality and completeness of a knowledge base. Our results show that for special-purpose knowledge bases (i) document-centric disambiguation significantly outperforms entity-centric disambiguation, (ii) document-centric disambiguation does not depend on the size of the knowledge base, while entity-centric approaches do, and (iii) disambiguation performance varies greatly across domains. These results suggest that domain heterogeneity, size, and knowledge base quality have to be carefully considered in the design of entity disambiguation systems, and that, for constructing knowledge bases, user-annotated texts are preferable to carefully constructed knowledge bases.
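
The abstract does not say how the two knowledge base representations are implemented, so the following is only a minimal sketch of the distinction, not the authors' method. It assumes a plain bag-of-words cosine similarity. The entity identifiers (`E1`, `E2`), the toy texts, and all function names are hypothetical. In the entity-centric variant each entity has a single description that is compared with the mention context. In the document-centric variant the knowledge base consists of annotated documents, and each document votes for the entities it is annotated with.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens as a term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical entity-centric knowledge base: one description per entity.
ENTITY_KB = {
    "E1": "cold common viral infection of the nose and throat",
    "E2": "cold low temperature exposure",
}

# Hypothetical document-centric knowledge base: annotated texts,
# each paired with the entity ids it was labeled with.
DOCUMENT_KB = [
    ("patients with a cold reported cough and sore throat", ["E1"]),
    ("workers exposed to cold weather risk hypothermia", ["E2"]),
]


def disambiguate_entity_centric(context, kb):
    """Return the entity whose description is most similar to the context."""
    query = bag_of_words(context)
    scores = {eid: cosine(query, bag_of_words(desc)) for eid, desc in kb.items()}
    return max(scores, key=scores.get)


def disambiguate_document_centric(context, kb):
    """Let each annotated document vote for its entities, weighted by similarity."""
    query = bag_of_words(context)
    votes = Counter()
    for text, entity_ids in kb:
        similarity = cosine(query, bag_of_words(text))
        for eid in entity_ids:
            votes[eid] += similarity
    return max(votes, key=votes.get)


if __name__ == "__main__":
    context = "the patient has a cold with cough and sore throat"
    print(disambiguate_entity_centric(context, ENTITY_KB))
    print(disambiguate_document_centric(context, DOCUMENT_KB))
```

In this toy setting both variants resolve the ambiguous surface form "cold" from its context. The sketch says nothing about which representation performs better. That question is what the experiments summarized in the abstract address.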