Extracting unrecognized gene relationships from the biomedical literature via matrix factorizations using a priori knowledge of gene relationships

  • Authors:
  • Hyunsoo Kim;Haesun Park

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA;Georgia Institute of Technology, Atlanta, GA

  • Venue:
  • TMBIO '06 Proceedings of the 1st international workshop on Text mining in bioinformatics
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

The construction of literature-based networks of gene-gene interactions is one of the most important applications of text mining in bioinformatics. Extracting potential gene relationships from the biomedical literature may be helpful in building biological hypotheses that can be explored further experimentally. In this paper, we explore the utility of singular value decomposition (SVD) and non-negative matrix factorization (NMF) to extract unrecognized gene relationships from the biomedical literature by taking advantage of known gene relationships. We introduce a way to incorporate a priori knowledge of gene relationships into LSI/SVD and NMF. In addition, we propose a gene retrieval method based on NMF (GR/NMF), which shows comparable performance with latent semantic indexing based on SVD.