High-dimensional micro-array data classification using minimum description length and domain expert knowledge

  • Authors:
  • Andrea Bosin;Nicoletta Dessì;Barbara Pes

  • Affiliations:
  • Dipartimento di Matematica e Informatica, Università degli Studi di Cagliari, Cagliari;Dipartimento di Matematica e Informatica, Università degli Studi di Cagliari, Cagliari;Dipartimento di Matematica e Informatica, Università degli Studi di Cagliari, Cagliari

  • Venue:
  • IEA/AIE'06 Proceedings of the 19th International Conference on Advances in Applied Artificial Intelligence: Industrial, Engineering and Other Applications of Applied Intelligent Systems
  • Year:
  • 2006

Abstract

This paper evaluates three machine learning methods, namely Naïve Bayes (NB), Adaptive Bayesian Network (ABN) and Support Vector Machines (SVM), for multi-target classification on micro-array datasets that combine a large feature space with very few samples. Features are ranked and selected with the Minimum Description Length (MDL) criterion, and experiments investigate the accuracy and effectiveness of each method in classifying many targets, as well as the effect of feature selection on the sensitivity of each classifier. The paper also shows how the knowledge of a domain expert makes it possible to decompose the multi-target classification into a set of binary classifications, one for each target, with a substantial improvement in accuracy. The effectiveness of the MDL criterion for choosing feature subsets is supported by empirical results showing that MDL is comparable with the entropy-based feature selection methodologies reported in earlier work.
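The two ideas in the abstract, splitting a multi-target problem into one binary problem per target and ranking features by description length, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact MDL formula, so the score below is a simplified two-part code (bits to encode the labels given a discretized feature, plus a (log2 n)/2 penalty per free parameter), and every name in it (`binary_targets`, `description_length`, `rank_features`, the toy features `g1` and `g2`) is hypothetical.

```python
import math
from collections import Counter


def binary_targets(labels):
    """Decompose multi-class labels into one 0/1 label vector per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def entropy_bits(values):
    """Shannon entropy, in bits per symbol, of a sequence of discrete values."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def description_length(feature, labels):
    """Simplified two-part code length (bits) of labels given a discrete feature.

    Assumption: data cost is the conditional entropy of the labels within each
    feature value, and model cost is (log2 n)/2 bits per free parameter.
    """
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    data_bits = sum(len(g) * entropy_bits(g) for g in groups.values())
    n_params = len(groups) * (len(set(labels)) - 1)
    model_bits = 0.5 * n_params * math.log2(n)
    return data_bits + model_bits


def rank_features(features, labels):
    """Return feature names ordered from shortest to longest description length."""
    return sorted(features, key=lambda name: description_length(features[name], labels))


# Toy data: six samples, three targets, two discretized expression features.
labels = ["A", "A", "B", "B", "C", "C"]
features = {
    "g1": ["lo", "lo", "hi", "hi", "lo", "lo"],  # separates B from the rest
    "g2": ["lo", "hi", "lo", "hi", "lo", "hi"],  # carries little information about B
}
targets = binary_targets(labels)
ranking_for_b = rank_features(features, targets["B"])  # ['g1', 'g2']
```

Ranking the features separately for each binary target, as in the last line, lets each target keep its own feature subset. Whether the paper selects features per target or once for all targets is not stated in the abstract, so this is a design choice of the sketch.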