Weakly Supervised Approaches for Ontology Population

Authors:
Hristo Tanev;Bernardo Magnini
Affiliations:
IPSC-JRC, Ispra, Italy;ITC-irst, Trento, Italy
Venue:
Proceedings of the 2008 conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge
Year:
2008

Abstract

We present a weakly supervised approach to automatic ontology population from text and compare it with two unsupervised approaches. In our experiments we populate a part of our ontology of Named Entities. We considered two high-level categories, geographical locations and person names, and ten sub-classes for each category. For each sub-class we automatically learn a syntactic model from a list of training examples and a parsed corpus. A novel syntactic indexing method allowed us to use large quantities of syntactically annotated data. The syntactic model for each Named Entity sub-class is a set of weighted syntactic features, i.e., words which typically co-occur with the members of the class in the corpus. The method is weakly supervised, since no manually annotated corpus is used in the learning process. The syntactic models are used to classify the unknown Named Entities in the test set. The method achieved promising results, with 65% accuracy, and significantly outperforms the other two approaches.
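
The idea of a class model built from weighted syntactic features can be illustrated with a small sketch. The abstract does not state the weighting formula, the feature format, or the scoring rule, so everything below is an assumption made for illustration: the function names (`learn_models`, `classify`), the toy co-occurrence pairs, the feature labels such as `flow:subj`, and the tf-idf-like weight are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def learn_models(seeds, cooccurrences):
    """Build one weighted feature set per class.

    seeds: dict mapping class name -> list of training entity names.
    cooccurrences: list of (entity, syntactic_feature) pairs, standing in
        for what a parsed corpus would provide (toy format, assumed).
    """
    # Collect the features observed with each entity.
    feats_by_entity = defaultdict(Counter)
    for entity, feature in cooccurrences:
        feats_by_entity[entity][feature] += 1

    # Sum the feature counts over the training examples of each class.
    class_counts = {}
    for cls, names in seeds.items():
        counts = Counter()
        for name in names:
            counts.update(feats_by_entity.get(name, Counter()))
        class_counts[cls] = counts

    # Assumed weighting: relative frequency in the class, boosted when the
    # feature occurs in few classes (a tf-idf-like heuristic).
    n_classes = len(seeds)
    models = {}
    for cls, counts in class_counts.items():
        total = sum(counts.values()) or 1
        weights = {}
        for feature, k in counts.items():
            df = sum(1 for c in class_counts.values() if feature in c)
            weights[feature] = (k / total) * math.log(1 + n_classes / df)
        models[cls] = weights
    return models, dict(feats_by_entity)


def classify(entity, models, feats_by_entity):
    """Return the class whose model best matches the entity's features,
    or None if no feature of the entity appears in any model."""
    observed = feats_by_entity.get(entity, {})
    scores = {
        cls: sum(weights.get(f, 0.0) * n for f, n in observed.items())
        for cls, weights in models.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy data: seed lists per sub-class and (entity, feature) pairs.
seeds = {"river": ["Danube", "Rhine"], "city": ["Trento", "Sofia"]}
cooccurrences = [
    ("Danube", "flow:subj"), ("Rhine", "flow:subj"), ("Danube", "bank_of:obj"),
    ("Trento", "mayor_of:obj"), ("Sofia", "mayor_of:obj"),
    ("Po", "flow:subj"), ("Po", "bank_of:obj"), ("Ispra", "mayor_of:obj"),
]

models, feats = learn_models(seeds, cooccurrences)
print(classify("Po", models, feats))
print(classify("Ispra", models, feats))
```

No manually annotated corpus appears anywhere in this sketch: the only supervision is the seed list per sub-class, which is what makes the setting weakly supervised. A real system would draw the co-occurrence pairs from a syntactically indexed corpus rather than a hand-written list.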