Feature engineering for a gene regulation prediction task

  • Authors: George Forman
  • Affiliations: HP Labs, Palo Alto, CA
  • Venue: ACM SIGKDD Explorations Newsletter
  • Year: 2002

Abstract

This paper describes an approach that won an honorable mention for the gene regulation prediction task of the 2002 KDD Cup competition [1]. Our methodology used extensive cross-validation to direct the search for an appropriate problem representation and the selection of an 'off-the-shelf' induction algorithm. A prominent trait of the dataset is the presence of three hierarchical attributes. For each of them we generated a novel predictive feature: the percentage of positives hierarchically aggregated at the node specified by the instance.
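The hierarchical feature can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the hierarchies are encoded or how unseen nodes are handled. The slash-separated path encoding, the function names (`ancestors`, `fit_node_rates`, `positive_rate_feature`), the fallback to the nearest known ancestor, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def ancestors(path, sep="/"):
    """Yield the node itself, then each ancestor: 'a/b/c' -> 'a/b/c', 'a/b', 'a'."""
    parts = path.split(sep)
    for i in range(len(parts), 0, -1):
        yield sep.join(parts[:i])


def fit_node_rates(paths, labels):
    """Fraction of positive instances at or below each node of the hierarchy.

    Each instance is counted at its own node and at every ancestor, so an
    internal node aggregates all instances in its subtree.
    """
    total = defaultdict(int)
    positives = defaultdict(int)
    for path, label in zip(paths, labels):
        for node in ancestors(path):
            total[node] += 1
            positives[node] += int(label)
    return {node: positives[node] / total[node] for node in total}


def positive_rate_feature(path, rates, default=0.0):
    """Feature value for one instance: the rate at its node.

    Assumption (not from the paper): if the node was never seen, fall back
    to the nearest ancestor with a known rate, then to `default`.
    """
    for node in ancestors(path):
        if node in rates:
            return rates[node]
    return default


# Hypothetical toy data: one hierarchical attribute and binary labels.
paths = ["nucleus/chromosome", "nucleus/chromosome", "nucleus/nucleolus", "cytoplasm"]
labels = [1, 0, 0, 1]

rates = fit_node_rates(paths, labels)
feature = positive_rate_feature("nucleus/envelope", rates)  # falls back to "nucleus"
```

In a cross-validated setup, the rates would normally be fitted on the training folds only and then applied to the held-out fold, so that an instance's own label does not leak into its feature value.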