Improving robustness and flexibility of concept taxonomy learning from text

Authors:
Fabio Leuzzi; Stefano Ferilli; Fulvio Rotella
Affiliations:
Dipartimento di Informatica, Università di Bari, Italy; Dipartimento di Informatica, Università di Bari, Italy, Centro Interdipartimentale per la Logica e sue Applicazioni, Università di Bari, Italy; Dipartimento di Informatica, Università di Bari, Italy
Venue:
NFMCP'12 Proceedings of the First international conference on New Frontiers in Mining Complex Patterns
Year:
2012


Abstract

The spread and abundance of electronic documents require automatic techniques for extracting useful information from the text they contain. The availability of conceptual taxonomies can be of great help, but building them manually is a complex and costly task. Building on previous work, we propose a technique to automatically extract conceptual graphs from text and reason with them. Automated learning of taxonomies needs to be robust with respect to missing or partial knowledge and flexible with respect to noise, and this work proposes a way to deal with both problems. The case of poor data or sparse concepts is tackled by finding generalizations among disjoint pieces of knowledge. Noise is handled by introducing soft relationships among concepts rather than hard ones, and by applying a probabilistic inferential setting. In particular, we propose to reason on the extracted graph using different kinds of relationships among concepts, where each arc (relationship) is associated with a weight that represents its likelihood among all possible worlds, and to address the problem of sparse knowledge by using generalizations among distant concepts as bridges between disjoint portions of knowledge.
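To make the "likelihood among all possible worlds" reading of arc weights concrete, the following minimal Python sketch computes the probability that one concept can be reached from another in a small weighted graph. It is an illustration only, not the authors' system: the function names, the toy concepts and weights, the treatment of arcs as directed, and the assumption that arcs are mutually independent are all hypothetical choices made for this example. It enumerates every possible world exhaustively, which is feasible only for very small graphs.

```python
from itertools import product


def reachable(edges, source, target):
    """Return True if target can be reached from source over directed (u, v) edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def path_probability(weighted_edges, source, target):
    """Sum the probabilities of all possible worlds in which target is reachable.

    Each (u, v, w) arc is assumed to be present independently with probability w.
    A possible world is one choice of present/absent for every arc.
    """
    edges = list(weighted_edges)
    total = 0.0
    for world in product([True, False], repeat=len(edges)):
        world_probability = 1.0
        present = []
        for (u, v, w), keep in zip(edges, world):
            world_probability *= w if keep else 1.0 - w
            if keep:
                present.append((u, v))
        if reachable(present, source, target):
            total += world_probability
    return total


# Hypothetical toy graph: concept names and weights are invented for illustration.
EXAMPLE_ARCS = [
    ("dog", "mammal", 0.9),
    ("mammal", "animal", 0.8),
    ("dog", "animal", 0.5),
]

if __name__ == "__main__":
    print(round(path_probability(EXAMPLE_ARCS, "dog", "animal"), 4))
```

In this toy graph the two routes from "dog" to "animal" share no arcs, so the result can be checked by hand as 1 - (1 - 0.9 × 0.8)(1 - 0.5) = 0.86. Exhaustive enumeration grows as 2^n in the number of arcs, so a practical system would rely on a dedicated probabilistic inference engine rather than this brute-force loop.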