Discovering discovery patterns with predication-based Semantic Indexing

  • Authors:
  • Trevor Cohen; Dominic Widdows; Roger W. Schvaneveldt; Peter Davies; Thomas C. Rindflesch

  • Affiliations:
  • University of Texas Health Science Center, Houston, TX, United States; Microsoft Bing, Redmond, WA, United States; Arizona State University, Mesa, AZ, United States; Center for Translational Cancer Research, Institute of Biosciences and Technology, Texas A&M Health Science Center, Houston, TX, United States; National Library of Medicine, Bethesda, MD, United States

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2012

Abstract

In this paper we use methods of hyperdimensional computing to identify therapeutically useful connections for literature-based discovery. Our approach, Predication-based Semantic Indexing (PSI), empirically identifies sequences of relationships known as "discovery patterns", such as "drug x INHIBITS substance y, substance y CAUSES disease z", that link pharmaceutical substances to the diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and are subsequently used to direct the search for known treatments for a held-out set of diseases. Rapid and efficient inference is accomplished by applying geometric operators in PSI space, which allows both the derivation of discovery patterns from a large set of known TREATS relationships and the application of these patterns to constrain the search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that other authors constructed manually in previous research, as well as the discovery of a set of previously unrecognized patterns. Applying these patterns to direct search through PSI space recovers therapeutic relationships better than models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means of identifying therapeutic relationships, suggesting a role for these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues.
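
The two uses of geometric operators described in the abstract, deriving a pattern from a known TREATS pair and then applying it to constrain search, can be illustrated with a toy sketch. The abstract does not state the vector representation or the operators, so everything here is an assumption made for illustration: binary vectors stored as Python integers, XOR as the binding operator, bitwise majority vote as superposition, a dimensionality of 4096, and invented concept names and predications (the `-INV` suffix is a hypothetical label for the inverse direction of a predicate). It is not the authors' implementation.

```python
import random
from itertools import combinations

DIM = 4096              # assumed dimensionality, chosen for illustration
rng = random.Random(0)  # fixed seed so the sketch is repeatable


def random_vector():
    """Random binary vector, stored as a DIM-bit Python int."""
    return rng.getrandbits(DIM)


def bind(a, b):
    """XOR binding (assumed operator); it is its own inverse."""
    return a ^ b


def bundle(vectors):
    """Superpose an odd number of vectors by bitwise majority vote."""
    vs = list(vectors)
    assert len(vs) % 2 == 1, "an odd count keeps the vote tie-free"
    out = 0
    for bit in range(DIM):
        if 2 * sum((v >> bit) & 1 for v in vs) > len(vs):
            out |= 1 << bit
    return out


def similarity(a, b):
    """1 - 2 * normalized Hamming distance: near 0 if unrelated, 1 if equal."""
    return 1.0 - 2.0 * bin(a ^ b).count("1") / DIM


# Hypothetical predicates and predications, invented for this example.
PREDICATES = ["INHIBITS", "CAUSES-INV", "ASSOCIATED_WITH",
              "INTERACTS_WITH", "AFFECTS"]
PREDICATIONS = {
    "drug_x": [("INHIBITS", "substance_y"), ("INTERACTS_WITH", "drug_b"),
               ("AFFECTS", "gene_a")],
    "disease_z": [("CAUSES-INV", "substance_y"), ("ASSOCIATED_WITH", "gene_c"),
                  ("AFFECTS", "gene_d")],
    "drug_w": [("INHIBITS", "substance_u"), ("INTERACTS_WITH", "drug_b"),
               ("ASSOCIATED_WITH", "gene_d")],
    "disease_v": [("CAUSES-INV", "substance_u"), ("ASSOCIATED_WITH", "gene_a"),
                  ("AFFECTS", "gene_c")],
    "disease_q": [("CAUSES-INV", "gene_a"), ("ASSOCIATED_WITH", "gene_c"),
                  ("AFFECTS", "drug_b")],
}

pred = {p: random_vector() for p in PREDICATES}  # one vector per predicate
elem = {}                                        # one random vector per concept


def elemental(name):
    """Return the concept's random vector, creating it on first use."""
    if name not in elem:
        elem[name] = random_vector()
    return elem[name]


def semantic(name):
    """Superpose bind(predicate, object) over the concept's predications."""
    return bundle(bind(pred[p], elemental(o)) for p, o in PREDICATIONS[name])


# Step 1: derive a pattern from a known TREATS pair (drug_x, disease_z).
# Binding the two semantic vectors cancels the shared middle term, leaving
# something close to bind(INHIBITS, CAUSES-INV).
cue = bind(semantic("drug_x"), semantic("disease_z"))
scores = {pair: similarity(cue, bind(pred[pair[0]], pred[pair[1]]))
          for pair in combinations(PREDICATES, 2)}
best_pair = max(scores, key=scores.get)
best_score = scores[best_pair]

# Step 2: apply the pattern to a held-out drug to rank candidate diseases.
pattern = bind(pred[best_pair[0]], pred[best_pair[1]])
query = bind(semantic("drug_w"), pattern)
candidates = ["disease_z", "disease_v", "disease_q"]
ranked = sorted(candidates, key=lambda d: similarity(query, semantic(d)),
                reverse=True)

print(best_pair, round(best_score, 2))
print(ranked)
```

In this toy setting the best-scoring predicate pair should be INHIBITS with CAUSES-INV, and disease_v should rank first for drug_w because the two are linked through substance_u by the same pattern. Because XOR is commutative, the pattern vector does not record the order of the two predicates, and the majority vote makes the match approximate rather than exact, which is why the comparison relies on similarity scores instead of equality.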