An incremental hypersphere learning framework for protein membership prediction

  • Authors:
  • Noel Lopes;Daniel Correia;Carlos Pereira;Bernardete Ribeiro;António Dourado

  • Affiliations:
  • CISUC - Center for Informatics and Systems of University of Coimbra, Portugal and UDI/IPG - Research Unit, Polytechnic Institute of Guarda, Portugal;CISUC - Center for Informatics and Systems of University of Coimbra, Portugal and ISEC - Coimbra Institute of Engineering, Portugal;CISUC - Center for Informatics and Systems of University of Coimbra, Portugal and ISEC - Coimbra Institute of Engineering, Portugal;CISUC - Center for Informatics and Systems of University of Coimbra, Portugal and Department of Informatics Engineering, University of Coimbra, Portugal;CISUC - Center for Informatics and Systems of University of Coimbra, Portugal and Department of Informatics Engineering, University of Coimbra, Portugal

  • Venue:
  • HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
  • Year:
  • 2012

Abstract

With the recent rise of fast-growing biological databases, it is essential to develop incremental learning algorithms that can extract information efficiently, in particular for constructing protein prediction models. Traditional inductive learning models such as Support Vector Machines (SVM) perform well when all the data is available. However, they are not suited to coping with databases that change dynamically. Recently, a new Incremental Hypersphere Classifier (IHC) algorithm, which performs instance selection, has been shown to be effective in online learning settings. In this paper we propose a two-step approach that first uses IHC to select a reduced data set (and also for immediate prediction) and then applies SVM for protein detection. By retaining the samples that play the most significant role in the construction of the decision surface while removing those that have little or no impact on the model, IHC can efficiently select a reduced data set. Under some conditions, the proposed IHC-SVM approach improves accuracy over the baseline SVM for the problem of peptidase detection.
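
The instance-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the IHC update rules, so the code below rests on assumptions: each stored sample carries a radius equal to half its distance to the nearest stored sample of another class, each class keeps at most `capacity` samples, and when a class overflows the sample with the largest radius (the one farthest from the other classes) is dropped. The names `HypersphereSelector`, `capacity`, `add`, `predict` and `retained` are hypothetical and not taken from the paper. The second step, training an SVM on the retained samples, is not shown.

```python
import math


class HypersphereSelector:
    """Simplified, assumed instance selector; not the authors' implementation."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed per-class memory limit
        self.memory = []          # entries are [features, label, radius]

    def add(self, x, label):
        # A new sample can only shrink the radii of other-class samples.
        for s in self.memory:
            if s[1] != label:
                s[2] = min(s[2], math.dist(x, s[0]) / 2)

        # Radius: half the distance to the nearest sample of another class.
        others = [math.dist(x, s[0]) for s in self.memory if s[1] != label]
        radius = min(others) / 2 if others else math.inf
        self.memory.append([x, label, radius])

        # On overflow, drop the same-class sample with the largest radius.
        # Simplification: radii are not recomputed after a removal.
        same = [s for s in self.memory if s[1] == label]
        if len(same) > self.capacity:
            victim = max(same, key=lambda s: s[2])
            self.memory = [s for s in self.memory if s is not victim]

    def predict(self, x):
        # Immediate prediction: label of the nearest hypersphere surface.
        best = min(self.memory, key=lambda s: math.dist(x, s[0]) - s[2])
        return best[1]

    def retained(self):
        # Reduced data set that a second-stage classifier could be trained on.
        return [(s[0], s[1]) for s in self.memory]


# Toy stream: class 0 lies near x = 0..3, class 1 near x = 10..13.
selector = HypersphereSelector(capacity=2)
stream = [((0, 0), 0), ((10, 0), 1), ((1, 0), 0), ((11, 0), 1),
          ((2, 0), 0), ((12, 0), 1), ((3, 0), 0), ((13, 0), 1)]
for features, label in stream:
    selector.add(features, label)
```

Under these assumptions, the selector keeps the samples of each class that lie closest to the other class, which is the reduced set the second stage would consume.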