Mining Concepts from Wikipedia for Ontology Construction

Authors:
Gaoying Cui;Qin Lu;Wenjie Li;Yirong Chen
Venue:
WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
Year:
2009


Abstract

An ontology is a structured knowledge base of concepts organized by the relations among them. In the corpora used for knowledge extraction, however, concepts are usually mixed with their instances, and the two are difficult to distinguish because they share similar features. This paper proposes a novel approach that obtains concepts comprehensively with the help of definition sentences and Category Labels in Wikipedia pages. N-gram statistics and other NLP knowledge are used to help extract appropriate concepts. The proposed method identified nearly 50,000 concepts from about 700,000 Wiki pages with a precision of 78.5%, which makes it an effective approach to mining concepts from Wikipedia for ontology construction.
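
The abstract does not give the extraction rules, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a definition sentence has the surface form "X is a/an/the Y ..." and that n-gram counts over category labels can hint at recurring concept terms. The regular expression, the clause-boundary word list, and the helper names `hypernym_candidate` and `count_label_ngrams` are all hypothetical and were chosen for illustration.

```python
import re
from collections import Counter

# Assumed (hypothetical) surface pattern for a definition sentence:
# "<term> is/are a/an/the <rest>".
DEFINITION = re.compile(
    r"^(?P<term>.+?)\s+(?:is|are)\s+(?:a|an|the)\s+(?P<rest>.+)$",
    re.IGNORECASE,
)

# Crude clause boundary: a comma or a common function word (an assumption).
BOUNDARY = re.compile(r",|\b(?:of|in|that|which|who|for|from|with|on|by)\b")


def hypernym_candidate(sentence):
    """Return the phrase after 'is a/an/the', cut at the first boundary."""
    match = DEFINITION.match(sentence.strip().rstrip("."))
    if match is None:
        return None
    head = BOUNDARY.split(match.group("rest"), maxsplit=1)[0]
    return head.strip().lower() or None


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_label_ngrams(labels, max_n=3):
    """Count 1..max_n-grams over lower-cased, whitespace-split labels."""
    counts = Counter()
    for label in labels:
        tokens = label.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts


if __name__ == "__main__":
    print(hypernym_candidate(
        "A dog is a domesticated mammal of the family Canidae."))
    labels = ["Domesticated animals", "Animals described in 1758"]
    print(count_label_ngrams(labels).most_common(3))
```

In this sketch, a phrase that is proposed by a definition sentence and also recurs across many category labels would be a stronger concept candidate than one supported by a single source. A real system would need part-of-speech information and a way to separate classes from instances, which this example does not attempt.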