Automatic domain-ontology structure and example acquisition from semi-structured texts

Authors:
Cheng Xiao;Dequan Zheng;Yuhang Yang
Affiliations:
MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin;MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin;MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7
Year:
2009

Abstract

This paper presents a new method for acquiring domain-ontology structure and examples from semi-structured data sources. First, the domain-ontology structure is extracted: candidate attributes are extracted using certain patterns, and a statistical method is applied to filter out incorrect attributes. Second, using the domain-ontology structure as a clue, example extraction patterns are generated automatically. Finally, ontology examples are acquired by taking advantage of the special structural features of Web pages. Experiments are carried out in the film domain: the precision of ontology structure extraction is 83.7%, and the highest recall of example extraction reaches 90%. The experimental results demonstrate that the method developed in this paper is fairly efficient.
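
The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the extraction patterns nor the statistical filter. The "label: value" regular expression, the page-frequency threshold `min_ratio`, the function names, and the toy film pages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not taken from the paper): a short
# label, a colon, then a value, as in "Director: Person X".
LABEL_VALUE = re.compile(r"^\s*([A-Za-z][A-Za-z ]{0,30}?)\s*:\s*(\S.*)$")


def candidate_attributes(page: str) -> set[str]:
    """Step 1a: collect the lowercased labels matched on one page."""
    labels = set()
    for line in page.splitlines():
        match = LABEL_VALUE.match(line)
        if match:
            labels.add(match.group(1).strip().lower())
    return labels


def filter_attributes(pages: list[str], min_ratio: float = 0.5) -> list[str]:
    """Step 1b: keep labels found on at least min_ratio of the pages.

    Page frequency stands in for the paper's unspecified statistical filter.
    """
    counts = Counter()
    for page in pages:
        counts.update(candidate_attributes(page))
    threshold = min_ratio * len(pages)
    return sorted(label for label, n in counts.items() if n >= threshold)


def extract_examples(page: str, attributes: list[str]) -> dict[str, str]:
    """Steps 2-3: use accepted attributes as clues to read values off a page."""
    wanted = set(attributes)
    example = {}
    for line in page.splitlines():
        match = LABEL_VALUE.match(line)
        if match:
            label = match.group(1).strip().lower()
            if label in wanted:
                example[label] = match.group(2).strip()
    return example


# Made-up toy pages; they are not data from the paper's experiments.
pages = [
    "Title: Film A\nDirector: Person X\nNote: see also the sequel",
    "Title: Film B\nDirector: Person Y\nRuntime: 120 min",
    "Title: Film C\nDirector: Person Z",
]
attributes = filter_attributes(pages)
examples = [extract_examples(page, attributes) for page in pages]
```

In this sketch, labels that appear on only one page ("note", "runtime") fall below the threshold and are dropped, while "title" and "director" survive and then guide the extraction of one example per page.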