Semantic Feature Selection Using WordNet

  • Authors:
  • Stephanie Chua;Narayanan Kulathuramaiyer

  • Affiliations:
  • Universiti Malaysia Sarawak;Universiti Malaysia Sarawak

  • Venue:
  • WI '04 Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

The web has caused an explosion of documents, requiring the need for an automated text categorization system. This paper explores the notion of semantic feature selection by employing WordNet [Introduction to WordNet: An On-line Lexical Database], a lexical database. The proposed semantic approach employs noun synonyms and word senses for feature selection to select terms that are semantically representative of a category of documents. The categorical sense disambiguation extends the use of WordNet, which has been typically used for text retrieval and word sense disambiguation [A WordNet-based Algorithm for Word Sense Disambiguation]. Our experiments on the Reuters-21578 dataset have shown that automated semantic feature selection is able to perform better than well known statistical feature selection methods, Information Gain and Chi-Square as a feature selection method.