Applying biclustering to text mining: an immune-inspired approach

  • Authors:
  • Pablo A. D. de Castro; Fabrício O. de França; Hamilton M. Ferreira; Fernando J. Von Zuben

  • Affiliations:
  • Laboratory of Bioinformatics and Bio-Inspired Computing, School of Electrical and Computer Engineering, University of Campinas, Campinas, SP, Brazil (all authors)

  • Venue:
  • ICARIS'07: Proceedings of the 6th International Conference on Artificial Immune Systems
  • Year:
  • 2007

Abstract

With the rapid growth in the volume of electronic text, computers have become a fundamental tool for organizing and classifying documents. Existing text mining methodologies apply standard clustering algorithms to group similar texts. However, these algorithms generally consider only global similarities between texts and assign each text to exactly one cluster, which limits the information that can be extracted. Biclustering is an alternative that addresses both drawbacks: it clusters rows and columns simultaneously, allowing a more comprehensive analysis of the texts. The main contribution of this paper is an immune-inspired biclustering algorithm for text mining, denoted BIC-aiNet. BIC-aiNet treats the biclustering problem as several two-way bipartition problems rather than as a single two-way permutation framework. The experimental results indicate that the proposal groups similar texts efficiently and extracts useful implicit information from groups of texts.
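
To make the notion of a bicluster concrete, the following minimal Python sketch may help. It is not BIC-aiNet and does not reproduce anything from the paper: the toy document-term matrix, the term list, the two hand-picked biclusters, and the helper names `submatrix`, `density`, and `shared_rows` are all hypothetical. It assumes rows are texts and columns are terms, and it only illustrates that a bicluster is a subset of rows paired with a subset of columns, so one text can belong to more than one group.

```python
# Toy document-term matrix (hypothetical data): rows are texts,
# columns are terms, and 1 means the term occurs in the text.
TERMS = ["immune", "antibody", "cluster", "matrix", "goal", "match"]
DOCS = [
    [1, 1, 0, 0, 0, 0],  # text 0: immunology
    [1, 1, 1, 1, 0, 0],  # text 1: immunology and data mining
    [0, 0, 1, 1, 0, 0],  # text 2: data mining
    [0, 0, 0, 0, 1, 1],  # text 3: sports
]


def submatrix(matrix, rows, cols):
    """Return the cells selected by a bicluster (row subset x column subset)."""
    return [[matrix[r][c] for c in cols] for r in rows]


def density(matrix, rows, cols):
    """Fraction of nonzero cells inside the selected submatrix."""
    cells = [v for row in submatrix(matrix, rows, cols) for v in row]
    return sum(1 for v in cells if v != 0) / len(cells)


def shared_rows(a, b):
    """Rows (texts) that appear in both biclusters."""
    return sorted(set(a[0]) & set(b[0]))


# Two hand-picked biclusters, each given as (row indices, column indices).
# A biclustering algorithm would search for such pairs; here they are fixed.
bicluster_a = ([0, 1], [0, 1])  # texts 0 and 1 on "immune", "antibody"
bicluster_b = ([1, 2], [2, 3])  # texts 1 and 2 on "cluster", "matrix"

if __name__ == "__main__":
    all_rows = list(range(len(DOCS)))
    all_cols = list(range(len(TERMS)))
    print("whole matrix density:", density(DOCS, all_rows, all_cols))
    print("bicluster A density:", density(DOCS, *bicluster_a))
    print("bicluster B density:", density(DOCS, *bicluster_b))
    print("texts in both biclusters:", shared_rows(bicluster_a, bicluster_b))
```

In this toy matrix, text 1 falls inside both biclusters because it shares one group of terms with text 0 and a different group with text 2. A clustering that assigns each text to exactly one cluster based on global similarity cannot express that overlap.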