Building a large scale knowledge base from Chinese wiki encyclopedia

Authors:
Zhichun Wang;Zhigang Wang;Juanzi Li;Jeff Z. Pan
Affiliations:
Department of Computer Science and Technology, Tsinghua University, China;Department of Computer Science and Technology, Tsinghua University, China;Department of Computer Science and Technology, Tsinghua University, China;Department of Computer Science, The University of Aberdeen, UK
Venue:
JIST'11 Proceedings of the 2011 joint international conference on The Semantic Web
Year:
2011

Abstract

DBpedia has proved to be a successful structured knowledge base, and large-scale Semantic Web data has been built by using DBpedia as the central interlinking hub of the Web of Data in English. In Chinese, however, because of the heavy size imbalance between the English and Chinese Wikipedias (the Chinese edition is no more than one tenth the size of the English one), little Chinese linked data has been published and linked to DBpedia, which hinders structured knowledge sharing both within Chinese resources and across languages. This paper aims to build a large-scale Chinese structured knowledge base from Hudong, one of the largest Chinese wiki encyclopedia websites. First, an upper-level ontology schema in Chinese is learned from the category system and Infobox information in Hudong. In total, 19542 concepts are inferred and organized in a hierarchy with at most 20 levels, and 2381 properties with domain and range information are learned from the attributes in the Hudong Infoboxes. Then, 802593 instances are extracted and described using the concepts and properties of the learned ontology. These instances cover a wide range of things, including persons, organizations, places and so on. Among them, 62679 are linked to identical instances in DBpedia. Moreover, the established Chinese knowledge base can be accessed through an RDF dump or SPARQL. The general upper-level ontology and wide coverage make the knowledge base a valuable Chinese semantic resource. It can be used not only in building Chinese linked data, the fundamental work for building a multilingual knowledge base across heterogeneous resources in different languages, but also to facilitate many useful applications of large-scale knowledge bases, such as knowledge question answering and semantic search.
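
To make the shape of such data concrete, the following is a minimal sketch of how one extracted instance might be described with a learned concept, an Infobox-derived property, and a link to an identical DBpedia instance, serialized as N-Triples-style lines. It is an illustration under stated assumptions, not the paper's implementation: the `http://example.org/hudong/` namespace, the function names, and the sample instance are all hypothetical, and only the `rdf:type`, `owl:sameAs`, and DBpedia resource URIs are standard.

```python
# Hypothetical namespace for the Hudong-derived knowledge base (an assumption,
# not taken from the paper).
KB = "http://example.org/hudong/"

# Standard vocabulary URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBPEDIA = "http://dbpedia.org/resource/"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes.

    Other control characters (such as newlines) are not handled in this sketch.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_instance(name, concept, attributes, dbpedia_name=None):
    """Return N-Triples-style lines describing one instance.

    name         -- instance identifier (hypothetical naming scheme)
    concept      -- concept from the learned ontology that types the instance
    attributes   -- mapping from Infobox attribute name to string value
    dbpedia_name -- DBpedia resource name if an identical instance is known
    """
    subject = uri(KB + "instance/" + name)
    triples = [(subject, uri(RDF_TYPE), uri(KB + "concept/" + concept))]
    for prop, value in attributes.items():
        triples.append((subject, uri(KB + "property/" + prop), literal(value)))
    if dbpedia_name is not None:
        # owl:sameAs states that the two URIs denote the same thing.
        triples.append((subject, uri(OWL_SAME_AS), uri(DBPEDIA + dbpedia_name)))
    return [f"{s} {p} {o} ." for s, p, o in triples]


# Illustrative sample data, not drawn from the knowledge base itself.
lines = describe_instance(
    "Tsinghua_University",
    "University",
    {"location": "Beijing"},
    dbpedia_name="Tsinghua_University",
)
for line in lines:
    print(line)
```

Each instance yields one typing triple, one triple per Infobox attribute, and, for the linked subset, one `owl:sameAs` triple pointing at DBpedia.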