Extracting protein sub-cellular localizations from literature

  • Authors:
  • Hong-Woo Chun; Jin-Dong Kim; Yun-Soo Choi; Won-Kyung Sung

  • Affiliations:
  • Korea Institute of Science and Technology Information, Daejeon, Republic of Korea; Database Center for Life Science, Research Organization of Information and System, Japan; Korea Institute of Science and Technology Information, Daejeon, Republic of Korea; Korea Institute of Science and Technology Information, Daejeon, Republic of Korea

  • Venue:
  • AMT'10 Proceedings of the 6th international conference on Active media technology
  • Year:
  • 2010

Abstract

Protein Sub-cellular Localization (PSL) prediction is an important task for predicting protein functions. Because the sequence-based approach used in most previous work focuses on predicting locations for given proteins, it fails to provide useful information when a single protein is localized in several different sub-cellular locations depending on its state. While this case is difficult for the sequence-based approach, it can be tackled by a text-based approach. The proposed approach extracts PSL relations from literature using Natural Language Processing techniques. We conducted experiments to see how well our system identifies evidence sentences and which linguistic features of sentences contribute significantly to the task. This article presents a novel text-based approach to extracting PSL relations together with their evidence sentences. Evidence sentences provide indispensable information that the sequence-based approach cannot supply.
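
To make the idea of an evidence sentence concrete, the following is a minimal sketch of how candidate sentences might be collected before any classification step. It is not the authors' system: it simply flags sentences in which a protein name and a location term co-occur. The lexicons, the function names `split_sentences` and `candidate_evidence`, and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicons; a real system would rely on curated resources.
PROTEINS = {"p53", "NF-kappaB"}
LOCATIONS = {"nucleus", "cytoplasm", "mitochondria"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_evidence(text, proteins=PROTEINS, locations=LOCATIONS):
    """Return (protein, location, sentence) triples for co-occurring mentions.

    Matching is case-insensitive substring search, which is only a rough
    stand-in for proper named-entity recognition.
    """
    triples = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for protein in sorted(proteins):
            if protein.lower() not in lowered:
                continue
            for location in sorted(locations):
                if location in lowered:
                    triples.append((protein, location, sentence))
    return triples


# Illustrative text (invented for this sketch, not taken from the paper).
sample = (
    "Upon DNA damage, p53 accumulates in the nucleus. "
    "In unstressed cells, p53 is found in the cytoplasm. "
    "The weather was fine."
)

for protein, location, sentence in candidate_evidence(sample):
    print(protein, "|", location, "|", sentence)
```

In this toy run the same protein is paired with two different locations, each backed by its own sentence, which is the kind of state-dependent localization the abstract says a sequence-based predictor cannot express. A real extractor would then have to decide, using linguistic features of each sentence, whether a co-occurrence actually asserts a localization.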