A Multi-Level Text Mining Method to Extract Biological Relationships

  • Authors:
  • Mathew Palakal;Matthew Stephens;Snehasis Mukhopadhyay;Rajeev Raje;Simon Rhodes

  • Venue:
  • CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
  • Year:
  • 2002

Abstract

Accurate and computationally efficient approaches to discovering relationships between biological objects in text documents are important for biologists developing biological models. This paper presents a novel approach to extracting relationships between multiple biological objects that are present in a text document. The approach involves object identification, reference resolution, ontology and synonym discovery, and extraction of object-object relationships. Hidden Markov Models (HMMs), dictionaries, and N-Gram models are used to set the framework for the complex task of extracting object-object relationships. Experiments were carried out using a corpus of one thousand Medline abstracts. Intermediate results were obtained for the object identification process, synonym discovery, and finally the relationship extraction. For the corpus of one thousand abstracts, 53 relationships were extracted, of which 43 were correct, giving a specificity of 81%. Unlike rule-based methods, the approach is both adaptable and scalable to new problems.
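
The reported 81% can be checked from the two counts given in the abstract. The short Python sketch below is illustrative only and is not taken from the paper; the function name `fraction_correct` is hypothetical. It computes the share of extracted relationships that were judged correct, which is the figure the abstract labels specificity.

```python
def fraction_correct(extracted: int, correct: int) -> float:
    """Return the share of extracted relationships judged correct.

    Hypothetical helper for illustration; not part of the paper's system.
    """
    if extracted <= 0:
        raise ValueError("extracted must be a positive count")
    if not 0 <= correct <= extracted:
        raise ValueError("correct must lie between 0 and extracted")
    return correct / extracted


# Counts reported in the abstract: 53 relationships extracted, 43 correct.
score = fraction_correct(extracted=53, correct=43)
print(f"{score:.1%}")  # prints "81.1%"
```

Rounded to a whole percentage, this matches the 81% stated in the abstract.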