Discovering patterns to extract protein-protein interactions from full biomedical texts

  • Authors: Minlie Huang; Xiaoyan Zhu; Donald G. Payan; Kunbin Qu; Ming Li

  • Affiliations: University of Tsinghua, Beijing, China; University of Tsinghua, Beijing, China; Rigel Pharmaceuticals Inc., South San Francisco, CA; Rigel Pharmaceuticals Inc., South San Francisco, CA; University of Waterloo, Canada and University of Tsinghua, Beijing, China

  • Venue: JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
  • Year: 2004

Abstract

Although there have been many research projects to extract protein pathways, most such information still exists only in the scientific literature, usually written in natural language and defying data mining efforts. We present a novel and robust approach for extracting protein-protein interactions from the literature. Our method uses a dynamic programming algorithm to compute distinguishing patterns by aligning relevant sentences and key verbs that describe protein interactions. A matching algorithm is designed to extract the interactions between proteins. Equipped only with a protein name dictionary, our system achieves a recall rate of about 80.0% and a precision rate of about 80.5%.
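
To make the idea of aligning sentences with dynamic programming concrete, here is a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes sentences are already tokenized, that protein names have been replaced by a placeholder token `PROTEIN` using the dictionary, and that a longest-common-subsequence alignment is an acceptable stand-in for the paper's alignment scheme. The function name `align_pattern` and the wildcard `GAP` are hypothetical.

```python
from typing import List

GAP = "*"  # hypothetical wildcard for stretches where the two sentences differ


def align_pattern(a: List[str], b: List[str]) -> List[str]:
    """Align two token lists and return their shared tokens in order,
    with a single GAP wherever either sentence has unmatched tokens."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start to recover the aligned pattern.
    pattern: List[str] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pattern.append(a[i])
            i += 1
            j += 1
        else:
            if not pattern or pattern[-1] != GAP:
                pattern.append(GAP)
            if lcs[i + 1][j] >= lcs[i][j + 1]:
                i += 1
            else:
                j += 1
    # Leftover tokens in either sentence also count as a gap.
    if (i < n or j < m) and (not pattern or pattern[-1] != GAP):
        pattern.append(GAP)
    return pattern


s1 = "PROTEIN interacts with PROTEIN in vivo".split()
s2 = "PROTEIN directly interacts with the PROTEIN".split()
print(align_pattern(s1, s2))
```

In this sketch the key verb ("interacts") and the two protein slots survive the alignment, while sentence-specific words collapse into wildcards. A real system would presumably also weight matches by part of speech and filter the resulting patterns, which is beyond what the abstract describes.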