Encoding of primary structures of biological macromolecules within a data mining perspective

  • Authors: Mondher Maddouri; Mourad Elloumi

  • Affiliations: Computer Science Department, National Institute of Applied Sciences and Technologies, Tunis-Carthage 2035 Tunis, Tunisia; Computer Science Department, Faculty of Economic Sciences and Management of Tunis, El Manar 2092 Tunisia, Tunisia

  • Venue: Journal of Computer Science and Technology - Special issue on bioinformatics
  • Year: 2004

Abstract

The encoding method has a direct effect on the quality and the representation of the knowledge discovered by data mining systems. Biological macromolecules are encoded as strings of characters, called primary structures. Because data mining systems usually use relational tables to encode data, these strings must be re-encoded and transformed into relational tables. In this paper, we present a comparative study of the existing static encoding methods, which are based on biologists' know-how, and our new dynamic encoding method, which is based on the construction of Discriminant and Minimal Substrings (DMS). Different classification methods are used to carry out this study. The experimental results show that our dynamic encoding method is more efficient than the static ones for encoding biological macromolecules within a data mining perspective.
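
To make the idea of re-encoding strings as a relational table concrete, here is a minimal sketch in Python. The abstract does not describe how DMS are actually constructed, so the definitions used here are assumptions for illustration only: a substring is treated as "discriminant" if it occurs in sequences of exactly one family, and as "minimal" if none of its proper substrings is itself discriminant. The function names, the `max_len` bound, and the toy sequences are all hypothetical and are not taken from the paper.

```python
def substrings(seq, max_len):
    """Return all distinct substrings of seq with length 1..max_len."""
    return {seq[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(len(seq) - k + 1)}


def discriminant_minimal_substrings(families, max_len=3):
    """Naive illustration under assumed definitions (not the paper's algorithm).

    families maps a family name to a list of sequences.
    """
    # Substrings observed in each family.
    seen = {name: set().union(*(substrings(s, max_len) for s in seqs))
            for name, seqs in families.items()}

    # Assumed "discriminant": present in one family, absent from all others.
    discriminant = set()
    for name, subs in seen.items():
        others = set().union(*(v for k, v in seen.items() if k != name))
        discriminant |= subs - others

    # Assumed "minimal": no shorter substring of it is already discriminant.
    minimal = {s for s in discriminant
               if not any(t in discriminant
                          for t in substrings(s, len(s) - 1))}
    return sorted(minimal, key=lambda s: (len(s), s))


def encode(sequences, features):
    """Build a relational table: one row per sequence, one 0/1 column per feature."""
    return [[int(f in seq) for f in features] for seq in sequences]


# Toy data, invented for this example.
families = {"A": ["ACGT", "ACGA"], "B": ["TTGA", "TTGC"]}
features = discriminant_minimal_substrings(families, max_len=3)
table = encode(["ACGT", "TTGC"], features)
```

With this toy input, `features` is `['AC', 'CG', 'GC', 'GT', 'TG', 'TT']` and `table` is `[[1, 1, 0, 1, 0, 0], [0, 0, 1, 0, 1, 1]]`. Longer substrings such as `ACG` are dropped because they contain a shorter discriminant substring. The columns here depend on the training sequences, which is the sense in which such an encoding is dynamic, whereas a static encoding would use a fixed set of columns chosen in advance.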