Robust prediction from multiple heterogeneous data sources with partial information

  • Authors:
  • Mohammad S. Aziz;Chandan K. Reddy

  • Affiliations:
  • Wayne State University, Detroit, MI, USA;Wayne State University, Detroit, MI, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Significant research efforts for robust integration of information from multiple sources are being pursued at a rapid pace. However, the information in heterogeneous sources is often incomplete and hence making the maximum use of all the available information is a challenging problem. Most of the recent research on data integration have been primarily focused on the cases where the information is available across all the different sources and can not effectively integrate sources in the presence of partial information. We develop an ensemble method that boosts the decisions made from different models on individual sources and obtain robust results for the task of class prediction. We propose a heterogeneous boosting framework that uses all the available information even if some of the sources do not provide any information about some objects. We demonstrate the effectiveness of the proposed framework for the problem of gene function prediction and compare to the state-of-the-art methods using several real-world biological datasets. We also show that the proposed method outperforms any kind of imputation schemes that are widely used while integrating data with partial information