A system for finding biological entities that satisfy certain conditions from texts

  • Authors:
  • Wei Zhou;Clement Yu;Weiyi Meng

  • Affiliations:
  • University of Illinois at Chicago, Chicago, IL, USA;University of Illinois at Chicago, Chicago, IL, USA;Binghamton University, Binghamton, NY, USA

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Finding biological entities (such as genes or proteins) that satisfy certain conditions from texts is an important and challenging task in biomedical information retrieval and text mining. It is essential for many biomedical applications, such as drug discovery which normally requires collecting existing scientific facts from documents. This paper presents an effective IR system for this task, in which 1) domain knowledge is incorporated to improve retrieval effectiveness; 2) query expansion with related concepts on multiple semantic levels is employed; 3) a gene symbol disambiguation technique is implemented. We evaluated these techniques and examined two different concept-based IR models. Experiments based upon the proposed framework yield significant improvement (22% for automatic and 16.7% for non-automatic) over the best reported results of passage retrieval in the Genomics track of TREC 2007.