The encoding method has a direct effect on the quality and representation of the knowledge discovered by data mining systems. Biological macromolecules are encoded as strings of characters, called primary structures. Because data mining systems usually represent data as relational tables, these strings must be re-encoded and transformed into relational tables. In this paper, we present a comparative study of the existing static encoding methods, which are based on biologists' know-how, and our new dynamic encoding method, which is based on the construction of Discriminant and Minimal Substrings (DMS). Several classification methods are used in this study. The experimental results show that, from a data mining perspective, our dynamic encoding method is more efficient than the static ones for encoding biological macromolecules.
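To make the idea of re-encoding strings as a relational table concrete, the following is a minimal sketch and not the construction used in the paper. It assumes, purely for illustration, that a substring is "discriminant" for a family when it occurs in at least a fraction `alpha` of that family's sequences and in at most a fraction `beta` of all other sequences, and "minimal" when none of its proper substrings is itself discriminant for the same family. The function names, the thresholds `alpha` and `beta`, the length bound `max_len`, and the toy sequences are all hypothetical.

```python
def substrings(seq, max_len):
    """Return the set of all substrings of seq with length 1..max_len."""
    return {seq[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(seq) - n + 1)}


def discriminant_minimal_substrings(families, alpha=1.0, beta=0.0, max_len=3):
    """Select substrings under the assumed criteria described in the lead-in.

    families: dict mapping a family label to a list of sequences.
    alpha, beta, max_len: illustrative parameters, not taken from the paper.
    """
    # Precompute the substring set of every sequence, grouped by family.
    per_seq = {label: [substrings(s, max_len) for s in seqs]
               for label, seqs in families.items()}
    selected = set()
    for label, own in per_seq.items():
        others = [subs for other, lst in per_seq.items()
                  if other != label for subs in lst]
        candidates = set().union(*own)
        discriminant = set()
        for x in candidates:
            in_own = sum(x in subs for subs in own) / len(own)
            in_others = (sum(x in subs for subs in others) / len(others)
                         if others else 0.0)
            if in_own >= alpha and in_others <= beta:
                discriminant.add(x)
        # Keep only substrings with no discriminant proper substring.
        for x in discriminant:
            if not (substrings(x, len(x) - 1) & discriminant):
                selected.add(x)
    return sorted(selected)


def encode(sequences, features):
    """Build a binary table: one row per sequence, one column per substring."""
    return [[int(f in s) for f in features] for s in sequences]


# Toy example with two hypothetical families of short nucleotide strings.
families = {"A": ["ACGTAC", "TACGGA"], "B": ["TTGCAT", "GGCATT"]}
features = discriminant_minimal_substrings(families)
table = encode(families["A"] + families["B"], features)
```

Each row of `table` is a fixed-length attribute vector, so the sequences can be passed to any classifier that expects tabular input. The columns depend on the training sequences themselves, which is the sense in which such an encoding is dynamic rather than fixed in advance.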