k-Neighborhood decentralization: A comprehensive solution to index the UMLS for large scale knowledge discovery

  • Authors:
  • Yang Xiang; Kewei Lu; Stephen L. James; Tara B. Borlawsky; Kun Huang; Philip R. O. Payne

  • Affiliations:
  • Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, United States; Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, United States; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, United States; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, United States; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, United States and OSUCCC Biomedical Informatics Shared Resource, The Ohio State University, Columbus, OH 43210, ...; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, United States and OSUCCC Biomedical Informatics Shared Resource, The Ohio State University, Columbus, OH 43210, ...

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2012

Abstract

The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous work has shown that knowledge constructs composed of transitively associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS poses a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, together with a corresponding method for effectively evaluating the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large-scale knowledge discovery. We demonstrated that kDLS paths are highly effective for prioritizing disease-gene relations across the whole genome, yielding extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large-scale knowledge discovery on the UMLS as a whole. We expect that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS in future medical informatics applications.
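
The abstract does not describe how kDLS labels are built, so the following is only a minimal sketch of the underlying graph notion of a k-neighborhood: the set of concepts reachable from a starting concept within k hops. It uses a plain breadth-first search rather than the kDLS labeling scheme itself. The function name `k_neighborhood`, the adjacency-list representation, and the toy concept labels are all assumptions made for illustration; none of them come from the paper or from the UMLS.

```python
from collections import deque


def k_neighborhood(graph, source, k):
    """Return {node: hop distance} for every node within k hops of source.

    Plain breadth-first search over an adjacency-list dict. This is an
    illustrative stand-in, not the kDLS indexing method.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # nodes at distance k are kept but not expanded
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


# Hypothetical toy graph; the labels are placeholders, not UMLS identifiers.
toy_graph = {
    "disease_A": ["symptom_B", "drug_C"],
    "symptom_B": ["disease_A", "gene_D"],
    "drug_C": ["disease_A", "gene_D"],
    "gene_D": ["symptom_B", "drug_C", "gene_E"],
    "gene_E": ["gene_D"],
}

within_two = k_neighborhood(toy_graph, "disease_A", 2)
```

In this toy graph, `gene_D` lies two hops from `disease_A` and is included, while `gene_E` lies three hops away and is not. Running such a search from scratch for every query is what becomes expensive on a graph the size of the UMLS, which is the cost an index is meant to avoid.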