Improving robustness and flexibility of concept taxonomy learning from text

  • Authors:
  • Fabio Leuzzi; Stefano Ferilli; Fulvio Rotella

  • Affiliations:
  • Dipartimento di Informatica, Università di Bari, Italy; Dipartimento di Informatica, Università di Bari, Italy, and Centro Interdipartimentale per la Logica e sue Applicazioni, Università di Bari, Italy; Dipartimento di Informatica, Università di Bari, Italy

  • Venue:
  • NFMCP'12: Proceedings of the First International Conference on New Frontiers in Mining Complex Patterns
  • Year:
  • 2012

Abstract

The spread and abundance of electronic documents call for automatic techniques that extract useful information from the text they contain. Conceptual taxonomies can be of great help, but building them manually is complex and costly. Building on previous work, we propose a technique to automatically extract conceptual graphs from text and reason with them. Automated learning of taxonomies must be robust to missing or partial knowledge and flexible with respect to noise, and this work proposes a way to deal with both problems. Sparse knowledge (poor data or sparse concepts) is tackled by finding generalizations among distant concepts and using them as bridges between disjoint portions of knowledge. Noise is handled by introducing soft relationships among concepts rather than hard ones and by applying a probabilistic inferential setting: reasoning on the extracted graph uses different kinds of relationships among concepts, where each arc (relationship) is associated with a weight that represents its likelihood among all possible worlds.
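
The two ideas in the abstract, weighted (soft) arcs and generalizations used as bridges, can be illustrated with a small sketch. This is not the authors' implementation: the class `ConceptGraph`, its methods, the relation labels, and all weights are hypothetical. It further assumes that arcs are independent and that a path is scored by the product of its arc weights, which is one simple reading of a probabilistic setting and may differ from the inference actually used in the paper.

```python
import heapq
import math
from collections import defaultdict


class ConceptGraph:
    """Undirected graph whose arcs carry a relation label and a likelihood in (0, 1]."""

    def __init__(self):
        # concept -> list of (neighbour, relation, weight)
        self.arcs = defaultdict(list)

    def add_arc(self, a, b, relation, weight):
        if not 0.0 < weight <= 1.0:
            raise ValueError("weight must be in (0, 1]")
        self.arcs[a].append((b, relation, weight))
        self.arcs[b].append((a, relation, weight))

    def add_generalization(self, parent, children, weight):
        """Bridge otherwise disjoint concepts through a shared, more general concept."""
        for child in children:
            self.add_arc(child, parent, "is_a", weight)

    def most_likely_path(self, source, target):
        """Return (probability, path) maximizing the product of arc weights.

        Returns (0.0, []) when the two concepts are not connected.
        """
        # Dijkstra on -log(weight): minimizing the sum maximizes the product.
        queue = [(0.0, source, [source])]
        done = set()
        while queue:
            cost, node, path = heapq.heappop(queue)
            if node == target:
                return math.exp(-cost), path
            if node in done:
                continue
            done.add(node)
            for neighbour, _relation, weight in self.arcs[node]:
                if neighbour not in done:
                    heapq.heappush(
                        queue, (cost - math.log(weight), neighbour, path + [neighbour])
                    )
        return 0.0, []


# Two disjoint pieces of knowledge (all concepts and weights are made up).
g = ConceptGraph()
g.add_arc("dog", "bark", "subject_of", 0.9)
g.add_arc("cat", "meow", "subject_of", 0.8)
before = g.most_likely_path("bark", "meow")  # not connected yet

# A generalization over the two concepts acts as a bridge.
g.add_generalization("animal", ["dog", "cat"], 0.7)
probability, path = g.most_likely_path("bark", "meow")
```

Before the generalization is added, the two fragments are unreachable from each other; afterwards a path exists through the hypothetical concept `animal`, with a score that decreases with every uncertain arc it crosses.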