Hierarchical hidden Markov models for information extraction

  • Authors: Marios Skounakis; Mark Craven; Soumya Ray

  • Affiliations: Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin; Department of Biostatistics & Medical Informatics, University of Wisconsin, Madison, Wisconsin; Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin

  • Venue: IJCAI'03: Proceedings of the 18th International Joint Conference on Artificial Intelligence
  • Year: 2003

Abstract

Information extraction can be defined as the task of automatically extracting instances of specified classes or relations from text. We consider the case of using machine learning methods to induce models for extracting relation instances from biomedical articles. We propose and evaluate an approach that is based on using hierarchical hidden Markov models to represent the grammatical structure of the sentences being processed. Our approach first uses a shallow parser to construct a multi-level representation of each sentence being processed. Then we train hierarchical HMMs to capture the regularities of the parses for both positive and negative sentences. We evaluate our method by inducing models to extract binary relations in three biomedical domains. Our experiments indicate that our approach results in more accurate models than several baseline HMM approaches.
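
To make the pipeline in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-level parse (phrases that contain words) and, as a deliberate simplification, scores only the phrase-type sequence with two flat discrete HMMs, one standing in for positive sentences and one for negative sentences. The hierarchical model described in the paper is richer than this. All state names, probabilities, and the example sentence are invented for illustration.

```python
import math

# Hypothetical two-level representation of one sentence, as a shallow
# parser might produce it: a list of (phrase_type, words) pairs.
sentence = [
    ("NP", ["the", "protein"]),
    ("VP", ["binds"]),
    ("NP", ["the", "receptor"]),
]


def phrase_types(parsed):
    """Return the upper-level sequence: one phrase type per phrase."""
    return [ptype for ptype, _words in parsed]


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs) under a discrete HMM, computed with the forward algorithm.

    Unscaled probabilities are used for brevity; long sequences would
    need scaling or log-space arithmetic to avoid underflow.
    """
    states = list(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: sum(alpha[r] * trans[r][s] for r in states)
            * emit[s].get(symbol, 0.0)
            for s in states
        }
    total = sum(alpha.values())
    return math.log(total) if total > 0 else float("-inf")


# Toy "positive" model (assumed parameters): alternates between an
# argument-like state and a linking state.
pos_model = dict(
    start={"arg": 0.8, "link": 0.2},
    trans={"arg": {"arg": 0.3, "link": 0.7},
           "link": {"arg": 0.8, "link": 0.2}},
    emit={"arg": {"NP": 0.9, "VP": 0.1},
          "link": {"NP": 0.2, "VP": 0.8}},
)

# Toy "negative" model (assumed parameters): a single background state.
neg_model = dict(
    start={"bg": 1.0},
    trans={"bg": {"bg": 1.0}},
    emit={"bg": {"NP": 0.5, "VP": 0.5}},
)

obs = phrase_types(sentence)
pos_ll = forward_log_likelihood(obs, **pos_model)
neg_ll = forward_log_likelihood(obs, **neg_model)

# Label the sentence by whichever toy model explains it better.
label = "positive" if pos_ll > neg_ll else "negative"
print(obs, label)
```

In this sketch the word level is carried in the data structure but not modeled. A hierarchical HMM would instead let each phrase-level state emit its word sequence through an embedded lower-level model.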