Ontology-based information content computation

  • Authors:
  • David Sánchez; Montserrat Batet; David Isern

  • Affiliations:
  • Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43 ...; Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43 ...; Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43 ...

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2011

Abstract

The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of a concept's semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated from the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered by corpora dependency and data sparseness. More recently, some authors have proposed IC-based measures that use taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches to IC computation and propose several improvements aimed at better capturing the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpus-based and ontology-based ones) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than those of related works.
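
To make the contrast between the two families of estimators mentioned in the abstract concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes the classic corpus-based definition IC(c) = -log p(c), where p(c) is the relative frequency of a concept together with its hyponyms, and one well-known ontology-based (intrinsic) formulation that uses only the hyponym count, IC(c) = 1 - log(hypo(c) + 1) / log(N). The taxonomy `PARENT`, the counts `COUNTS`, and the function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
}

# Made-up corpus occurrence counts for each concept (illustrative only).
COUNTS = {"entity": 0, "animal": 2, "plant": 1, "dog": 5, "cat": 3, "oak": 1}


def descendants(concept):
    """Return the set of all strict descendants (hyponyms) of a concept."""
    children = [c for c, p in PARENT.items() if p == concept]
    result = set(children)
    for child in children:
        result |= descendants(child)
    return result


def corpus_ic(concept):
    """Corpus-based IC: -log p(c), counting the concept and all its hyponyms."""
    total = sum(COUNTS.values())
    freq = COUNTS[concept] + sum(COUNTS[d] for d in descendants(concept))
    return -math.log(freq / total)


def intrinsic_ic(concept):
    """Intrinsic IC from hyponym counts: 1 - log(hypo(c) + 1) / log(N)."""
    n_concepts = len(PARENT)
    return 1.0 - math.log(len(descendants(concept)) + 1) / math.log(n_concepts)


if __name__ == "__main__":
    for name in PARENT:
        print(f"{name:7s} corpus={corpus_ic(name):.3f} intrinsic={intrinsic_ic(name):.3f}")
```

In both variants the root is the least informative concept and leaves are the most informative. The corpus-based version depends on the counts: a concept whose accumulated count is zero would make `corpus_ic` fail on `log(0)`, which is one way the data sparseness problem shows up in practice. The intrinsic version needs only the taxonomy.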