Selection of relevant features and examples in machine learning
Artificial Intelligence - Special issue on relevance
An introduction to variable and feature selection
The Journal of Machine Learning Research
Efficient Feature Selection via Analysis of Relevance and Redundancy
The Journal of Machine Learning Research
Pattern Recognition Letters
Hi-index | 0.00 |
Protein structure prediction consists in determining the thre-e-dimensional conformation of a protein based only on its amino acid sequence. This is currently a difficult and significant challenge in structural bioinformatics because these structures are necessary for drug designing. This work proposes a method that reconstructs protein structures from protein fragments assembled according to their physico-chemical similarities, using information extracted from known protein structures. Our prediction system produces distance maps to represent protein structures, which provides more information than contact maps, which are predicted by many proposals in the literature. Most commonly used amino acid physico-chemical properties are hydrophobicity, polarity and charge. In our method, we performed a feature selection on the 544 properties of the AAindex repository, resulting in 16 properties which were used to predictions. We tested our proposal on 74 mitochondrial matrix proteins with a maximum sequence identity of 30% obtained from the Protein Data Bank. We achieved a recall of 0.80 and a precision of 0.79 with an 8-angstrom cut-off and a minimum sequence separation of 7 amino acids. Finally, we compared our system with other relevant proposal on the same benchmark and we achieved a recall improvement of 50.82%. Therefore, for the studied proteins, our method provides a notable improvement in terms of recall.