Effectiveness of Information Extraction, Multi-Relational, and Semi-Supervised Learning for Predicting Functional Properties of Genes

  • Authors:
  • Mark-A. Krogel; Tobias Scheffer

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Abstract

We focus on the problem of predicting functional properties of the proteins corresponding to genes in the yeast genome. Our goal is to study the effectiveness of approaches that utilize all data sources that are available in this problem setting, including unlabeled and relational data, and abstracts of research papers. We study transduction and co-training for using unlabeled data. We investigate a propositionalization approach which uses relational gene interaction data. We study the benefit of information extraction for utilizing a collection of scientific abstracts. The studied tasks are KDD Cup tasks of 2001 and 2002. The solutions which we describe achieved the highest score for task 2 in 2001, the fourth rank for task 3 in 2001, the highest score for one of the two subtasks and the third place for the overall task 2 in 2002.