Granules of words to represent text: an approach based on fuzzy relations and spectral clustering

  • Authors:
  • Patrícia F. Castro; Geraldo B. Xexéo

  • Affiliations:
  • Departamento de Engenharia de Sistemas e Computação, COPPE/UFRJ, Rio de Janeiro, Brasil; Departamento de Engenharia de Sistemas e Computação, COPPE/UFRJ, Rio de Janeiro, Brasil, and Departamento de Ciência da Computação, IM/UFRJ, Rio de Janeiro, Brasil

  • Venue:
  • ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part IV
  • Year:
  • 2012

Abstract

The amount of data available in semi-structured or unstructured formats grows exponentially, and text mining aims to discover knowledge from data of this type. Most work in this area represents texts with the bag-of-words model. Although effective, this representation limits the quality of the knowledge discovered because it cannot capture essential characteristics of text, such as semantics and context. The granular computing paradigm has proven effective for complex information-processing problems and can produce significant results in large-scale environments such as the Web. This paper explores the granulation of words, with a view to improving text representation. We use fuzzy relations and spectral clustering in this process and present some results.
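
To make the idea of word granulation concrete, here is a minimal sketch in Python. It is not the authors' method, which the abstract does not specify. It assumes that the fuzzy relation between two words is the Jaccard overlap of the documents containing them, and that the spectral step is a two-way split on the sign of the Fiedler vector of the normalized graph Laplacian. The toy corpus `DOCS` and the function names `fuzzy_cooccurrence`, `spectral_bipartition` and `granulate` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical toy corpus: each document is a list of words.
DOCS = [
    ["fuzzy", "set", "membership"],
    ["fuzzy", "membership"],
    ["set", "membership", "fuzzy"],
    ["graph", "laplacian", "eigenvector"],
    ["graph", "eigenvector"],
    ["laplacian", "graph"],
    ["fuzzy", "graph"],  # weak bridge that keeps the word graph connected
]


def fuzzy_cooccurrence(docs):
    """Build a reflexive, symmetric fuzzy relation over the vocabulary.

    Assumption: the membership degree of a word pair is the Jaccard overlap
    of the sets of documents in which each word occurs, a value in [0, 1].
    """
    vocab = sorted({w for d in docs for w in d})
    occ = {w: {i for i, d in enumerate(docs) if w in d} for w in vocab}
    n = len(vocab)
    R = np.eye(n)  # every word is fully related to itself
    for i in range(n):
        for j in range(i + 1, n):
            a, b = occ[vocab[i]], occ[vocab[j]]
            R[i, j] = R[j, i] = len(a & b) / len(a | b)
    return vocab, R


def spectral_bipartition(R):
    """Split the words in two using the sign of the Fiedler vector.

    Assumes every word co-occurs with at least one other word, so that all
    degrees are positive and the graph has no isolated nodes.
    """
    W = R.copy()
    np.fill_diagonal(W, 0.0)  # drop self-loops before building the graph
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    fiedler = vecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return fiedler >= 0


def granulate(docs):
    """Return two granules of words as a set of frozensets."""
    vocab, R = fuzzy_cooccurrence(docs)
    side = spectral_bipartition(R)
    first = frozenset(w for w, s in zip(vocab, side) if s)
    second = frozenset(w for w, s in zip(vocab, side) if not s)
    return {first, second}


granules = granulate(DOCS)
for g in granules:
    print(sorted(g))
```

The granules are returned as an unordered set because the sign of an eigenvector is arbitrary, so which group lands on the non-negative side carries no meaning. For more than two granules, a common variant keeps the first k eigenvectors and runs k-means on their rows; that step is omitted here to keep the sketch dependent only on NumPy.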