Entity linking at web scale

Authors:
Thomas Lin; Mausam;Oren Etzioni
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
AKBC-WEKEX '12 Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction
Year:
2012

Citing 11
Cited 4

Freebase: a collaboratively created graph database for structuring human knowledge

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Learning to link with wikipedia

Proceedings of the 17th ACM conference on Information and knowledge management
Using Wikipedia to bootstrap open information extraction

ACM SIGMOD Record
Collective annotation of Wikipedia entities in web text

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Open information extraction from the web

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Learning first-order Horn clauses from web text

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Entity disambiguation for knowledge base population

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Large-scale cross-document coreference using distributed inference and hierarchical models

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Local and global algorithms for disambiguation to Wikipedia

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Robust disambiguation of named entities in text

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Identifying relations for open information extraction

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Linking named entities to any database

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Re-ranking for joint named-entity recognition and linking

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Using natural language to integrate, evaluate, and optimize extracted knowledge bases

Proceedings of the 2013 workshop on Automated knowledge base construction
Exploring re-ranking approaches for joint named-entityrecognition and linking

Proceedings of the sixth workshop on Ph.D. students in information and knowledge management

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper investigates entity linking over millions of high-precision extractions from a corpus of 500 million Web documents, toward the goal of creating a useful knowledge base of general facts. This paper is the first to report on entity linking over this many extractions, and describes new opportunities (such as corpus-level features) and challenges we found when entity linking at Web scale. We present several techniques that we developed and also lessons that we learned. We envision a future where information extraction and entity linking are paired to automatically generate knowledge bases with billions of assertions over millions of linked entities.