Extracellular plant proteins are involved in numerous processes including nutrient acquisition, communication with other soil organisms, protection from pathogens, and resistance to disease and toxic metals. Insofar as these proteins are strategically positioned to play a role in resistance to environmental stress, biologists are interested in proteomic tools for analyzing extracellular proteins. In this paper, we present three methods that use frequent subsequences of amino acids: one based on support vector machines (SVM), one based on boosting, and FSP, a new frequent subsequence pattern method. We test our methods on a plant dataset, and the experimental results show that they perform better than existing approaches based on amino acid composition.
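To make the idea of frequent subsequences concrete, here is a minimal sketch of one way such patterns could be mined and turned into classifier features. It is an illustration under stated assumptions, not the paper's method: it assumes patterns are contiguous substrings of a fixed length, that "frequent" means appearing in at least a minimum number of sequences, and that each protein is encoded as a binary presence vector. The function names, the toy sequences, and the parameters `length` and `min_count` are all hypothetical.

```python
from collections import Counter


def frequent_subsequences(sequences, length=3, min_count=2):
    """Return fixed-length substrings found in at least min_count sequences.

    Assumption: patterns are contiguous and of a single fixed length.
    """
    counts = Counter()
    for seq in sequences:
        # Collect each substring once per sequence, so the count is the
        # number of sequences containing it, not the number of occurrences.
        seen = {seq[i:i + length] for i in range(len(seq) - length + 1)}
        counts.update(seen)
    return sorted(s for s, c in counts.items() if c >= min_count)


def to_features(seq, patterns):
    """Encode a sequence as a binary vector: 1 if the pattern occurs in it."""
    return [1 if p in seq else 0 for p in patterns]


# Toy amino acid strings, invented for illustration only.
proteins = ["MKTLLA", "AKTLLG", "MKTAAA"]
patterns = frequent_subsequences(proteins, length=3, min_count=2)
vectors = [to_features(p, patterns) for p in proteins]
```

Vectors like these could then be passed to an SVM or a boosting learner, in contrast to features built only from overall amino acid composition.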