An ontology-based pattern mining system for extracting information from biological texts

  • Authors:
  • Muhammad Abulaish; Lipika Dey

  • Affiliations:
  • Department of Mathematics, Jamia Millia Islamia (A Central University), New Delhi, India; Department of Mathematics, Indian Institute of Technology, Delhi, Hauz Khas, New Delhi, India

  • Venue:
  • RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
  • Year:
  • 2005

Abstract

Biological information embedded in large repositories of unstructured or semi-structured text documents can be extracted more efficiently through semantic analysis of the texts combined with structured domain knowledge. The GENIA corpus houses tagged MEDLINE abstracts, manually annotated according to the GENIA ontology, for this purpose. However, manually tagging all texts is infeasible, and special-purpose storage and retrieval mechanisms are required to reduce information overload for users. In this paper we propose an ontology-based Biological Information Extraction and Query Answering (BIEQA) system with four components: an ontology-based tag analyzer that analyzes tagged texts to extract biological and lexical patterns; an ontology-based tagger that tags new texts; a knowledge base enhancer that enhances the ontology and incorporates new knowledge, in the form of biological entities and relationships, into the knowledge base; and a query processor that handles user queries.
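
The four components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the simplified `<cons sem="...">` tag format (loosely modeled on GENIA-style annotation), the function `analyze_tags`, the class `KnowledgeBase`, and the rule that the text between two adjacent tagged entities is their relation are all assumptions made for illustration only.

```python
import re

# Hypothetical, simplified tag format loosely modeled on GENIA-style markup.
TAG = re.compile(r'<cons sem="([^"]+)">([^<]+)</cons>')


def analyze_tags(tagged_sentence):
    """Tag analyzer (sketch): return (name, class) pairs and relation triples.

    Assumption: the text between two adjacent tagged entities is treated
    as the relation linking them.
    """
    matches = list(TAG.finditer(tagged_sentence))
    entities = [(m.group(2), m.group(1)) for m in matches]
    triples = []
    for left, right in zip(matches, matches[1:]):
        relation = tagged_sentence[left.end():right.start()].strip()
        if relation:
            triples.append((left.group(2), relation, right.group(2)))
    return entities, triples


class KnowledgeBase:
    """Holds known entities and relations (illustrative only)."""

    def __init__(self):
        self.entity_class = {}   # entity name -> ontology class
        self.relations = set()   # (subject, relation, object) triples

    def enhance(self, entities, triples):
        """Knowledge base enhancer (sketch): add new entities and relations."""
        self.entity_class.update(entities)
        self.relations.update(triples)

    def tag(self, text):
        """Tagger (sketch): mark known entity names in untagged text."""
        if not self.entity_class:
            return text
        # Longest names first so that longer entity names win over substrings.
        names = sorted(self.entity_class, key=len, reverse=True)
        pattern = re.compile("|".join(re.escape(n) for n in names))
        return pattern.sub(
            lambda m: f'<cons sem="{self.entity_class[m.group(0)]}">{m.group(0)}</cons>',
            text,
        )

    def query(self, subject=None, relation=None, obj=None):
        """Query processor (sketch): return triples matching the given fields."""
        return sorted(
            t for t in self.relations
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)
        )


if __name__ == "__main__":
    kb = KnowledgeBase()
    sentence = ('<cons sem="protein_molecule">IL-2</cons> activates '
                '<cons sem="cell_type">T cells</cons>')
    kb.enhance(*analyze_tags(sentence))
    print(kb.tag("IL-2 binds T cells"))
    print(kb.query(relation="activates"))
```

In this sketch the knowledge base is an in-memory dictionary and set; the system described in the abstract relies on an ontology and dedicated storage and retrieval mechanisms, which a toy example like this does not attempt to reproduce.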