Semantic Labeling of Data by Using the Web

  • Authors:
  • Leonardo Rigutini;Ernesto Di Iorio;Marco Ernandes;Marco Maggini

  • Affiliations:
  • Università di Siena, Italy;Università di Siena, Italy;Università di Siena, Italy;Università di Siena, Italy

  • Venue:
  • WI-IATW '06 Proceedings of the 2006 IEEE/WIC/ACM international conference on Web Intelligence and Intelligent Agent Technology
  • Year:
  • 2006

Abstract

This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available on the Web to create a profile of each term or entity, called an Entity Context Lexicon (ECL). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization), composed primarily of the words that often appear in the same contexts as the entity. These profiles model the contexts in which the entity usually appears, and they can subsequently be processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task, which uses the ECLs of the training entities to build a profile for each class, called a Class Context Lexicon (CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains).
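
The pipeline described in the abstract (entity profiles, class profiles, then a labeling decision) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify term weighting, the similarity measure, or how Web contexts are retrieved, so the sketch assumes raw term counts, cosine similarity, and hand-written snippets standing in for Web search results. The neural decision module for multi-label classification is replaced here by a simple relative threshold. All function names, the stopword list, and the toy data are hypothetical.

```python
import math
from collections import Counter

# Assumption: a tiny illustrative stopword list, not taken from the paper.
STOPWORDS = {"the", "a", "of", "in", "is", "and", "to"}


def build_ecl(entity, snippets):
    """Build a bag-of-words profile (ECL) from texts that mention the entity."""
    ecl = Counter()
    for text in snippets:
        tokens = [t.strip(".,;:!?()").lower() for t in text.split()]
        ecl.update(
            t for t in tokens
            if t and t != entity.lower() and t not in STOPWORDS
        )
    return ecl


def build_ccl(ecls):
    """Build a class profile (CCL) by summing the ECLs of its training entities."""
    ccl = Counter()
    for ecl in ecls:
        ccl.update(ecl)
    return ccl


def cosine(a, b):
    """Cosine similarity between two term-count profiles."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(ecl, ccls, threshold=0.8):
    """Score an ECL against every CCL and keep the labels close to the best score.

    Assumption: the relative threshold is a stand-in for the paper's
    neural-network decision module.
    """
    scores = {label: cosine(ecl, ccl) for label, ccl in ccls.items()}
    best = max(scores.values())
    labels = sorted(
        label for label, s in scores.items() if best > 0 and s >= threshold * best
    )
    return labels, scores


# Toy training data: snippets stand in for contexts retrieved from the Web.
training = {
    "medicine": {
        "aspirin": ["Aspirin is a drug used to treat pain and fever."],
        "penicillin": ["Penicillin is a drug that treats infection."],
    },
    "sport": {
        "football": ["Football is a team game played with a ball."],
        "tennis": ["Tennis players hit a ball over a net in a match."],
    },
}

ccls = {
    label: build_ccl(build_ecl(entity, texts) for entity, texts in entities.items())
    for label, entities in training.items()
}

query_ecl = build_ecl("ibuprofen", ["Ibuprofen is a drug that reduces pain and fever."])
labels, scores = classify(query_ecl, ccls)
print(labels, scores)
```

In this toy setting the query entity shares context words only with the "medicine" class profile, so it receives that single label. Lowering `threshold` would let an entity with comparable scores in several classes receive more than one label, which is the situation the paper addresses with a learned decision module.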