LIEGE: link entities in web lists with knowledge base

Authors:
Wei Shen;Jianyong Wang;Ping Luo;Min Wang
Affiliations:
Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;HP Labs China, Beijing, China;HP Labs China, Beijing, China
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

A critical step in bridging the knowledge base with the huge corpus of semi-structured Web list data is to link the entity mentions that appear in Web lists with the corresponding real-world entities in the knowledge base, which we call the list linking task. This task can facilitate many other tasks, such as knowledge base population, entity search, and table annotation. However, the list linking task is challenging because a Web list has almost no textual context, and the only input for the task is a list of entity mentions extracted from Web pages. In this paper, we propose LIEGE, which is, to the best of our knowledge, the first general framework to Link the entities in web lists with the knowledge base. We assume that the entities mentioned in a Web list can be any collection of entities that share a conceptual type that people have in mind. To annotate the items in a Web list with the entities they likely mention, we leverage the prior probability of an entity being mentioned and the global coherence between the types of the entities in the Web list. The interdependence between different entity assignments in a Web list makes the optimization of this list linking problem NP-hard. Accordingly, we propose a practical solution based on iterative substitution to jointly optimize the identification of the mapping entities for the Web list items. We extensively evaluated our framework on both manually annotated real Web lists extracted from Web pages and two public data sets, and the experimental results show that it significantly outperforms the baseline method in terms of accuracy.
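
The idea of combining a prior probability with type coherence, and improving the joint assignment by iterative substitution, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact scoring functions, so the Jaccard-based `type_similarity`, the weight `alpha`, the averaging in `objective`, and the toy candidates, priors, and type sets below are all assumptions made for illustration only.

```python
from itertools import combinations


def type_similarity(types_a, types_b):
    """Jaccard overlap of two type sets (an assumed stand-in for a type-coherence measure)."""
    union = types_a | types_b
    return len(types_a & types_b) / len(union) if union else 0.0


def objective(assignment, priors, types, alpha=0.5):
    """Assumed objective: weighted sum of mean prior and mean pairwise type coherence."""
    mentions = list(assignment)
    prior = sum(priors[m][assignment[m]] for m in mentions) / len(mentions)
    pairs = list(combinations(mentions, 2))
    if pairs:
        coherence = sum(
            type_similarity(types[assignment[a]], types[assignment[b]])
            for a, b in pairs
        ) / len(pairs)
    else:
        coherence = 0.0
    return alpha * prior + (1 - alpha) * coherence


def link_list(priors, types, alpha=0.5):
    """Greedy iterative substitution: start from the highest-prior candidates,
    then swap one mention's entity at a time while the objective improves."""
    assignment = {m: max(cands, key=cands.get) for m, cands in priors.items()}
    best = objective(assignment, priors, types, alpha)
    improved = True
    while improved:
        improved = False
        for mention, cands in priors.items():
            for entity in cands:
                if entity == assignment[mention]:
                    continue
                trial = {**assignment, mention: entity}
                score = objective(trial, priors, types, alpha)
                if score > best + 1e-12:  # accept only strict improvements
                    assignment, best, improved = trial, score, True
    return assignment


# Hypothetical toy data: candidate entities with made-up priors and type sets.
priors = {
    "Jordan": {"Jordan_(country)": 0.6, "Michael_Jordan": 0.4},
    "Bird": {"Bird_(animal)": 0.7, "Larry_Bird": 0.3},
    "Kobe Bryant": {"Kobe_Bryant": 1.0},
}
types = {
    "Jordan_(country)": {"country"},
    "Michael_Jordan": {"person", "basketball_player"},
    "Bird_(animal)": {"animal"},
    "Larry_Bird": {"person", "basketball_player"},
    "Kobe_Bryant": {"person", "basketball_player"},
}

result = link_list(priors, types)
print(result)
```

In this toy list, the prior alone would pick the country and the animal, but the shared type of the other list items pulls both ambiguous mentions toward the basketball players. Because each accepted substitution strictly increases the objective and the number of assignments is finite, the loop terminates, though it may stop at a local optimum rather than the global one, which is consistent with the problem being NP-hard.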