Segment-based hidden Markov models for information extraction

Authors:
Zhenmei Gu; Nick Cercone
Affiliations:
University of Waterloo, Waterloo, Ontario, Canada; Dalhousie University, Halifax, Nova Scotia, Canada
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Abstract

Hidden Markov models (HMMs) are powerful statistical models that have been applied successfully to Information Extraction (IE). Current HMM-based IE approaches model text at the document level. Document-level modelling can cause undesired extraction redundancy, in that more than one filler is identified and extracted. We propose using HMMs to model text at the segment level, so that extraction consists of two steps: a segment retrieval step followed by an extraction step. To retrieve extraction-relevant segments from documents, we introduce a method that uses HMMs to model and retrieve segments. Our experimental results show that the resulting segment HMM IE system not only achieves near-zero extraction redundancy, but also has better overall extraction performance than traditional document HMM IE systems.
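
The two-step process described in the abstract (segment retrieval, then extraction) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-state topology (`bg`, `filler`), the hand-set probabilities, the `UNSEEN` floor, and the retrieval score (length-normalized Viterbi log probability, restricted to segments whose best path visits the filler state). The function names `viterbi`, `retrieve_segment`, and `extract` are hypothetical.

```python
import math

STATES = ("bg", "filler")
# Hypothetical hand-set parameters; a real system would estimate them from labelled data.
START = {"bg": 0.9, "filler": 0.1}
TRANS = {"bg": {"bg": 0.8, "filler": 0.2},
         "filler": {"bg": 0.4, "filler": 0.6}}
EMIT = {"bg": {"the": 0.3, "talk": 0.2, "is": 0.2, "by": 0.2},
        "filler": {"dr": 0.4, "smith": 0.4}}
UNSEEN = 0.01  # crude floor for words missing from an emission table (assumption)


def viterbi(tokens):
    """Return (log probability, state path) of the best path for one non-empty segment."""
    def emit(state, word):
        return math.log(EMIT[state].get(word, UNSEEN))

    # best[s] = (log prob of the best path ending in state s, that path)
    best = {s: (math.log(START[s]) + emit(s, tokens[0]), [s]) for s in STATES}
    for word in tokens[1:]:
        step = {}
        for s in STATES:
            score, path = max(
                (best[p][0] + math.log(TRANS[p][s]), best[p][1]) for p in STATES
            )
            step[s] = (score + emit(s, word), path + [s])
        best = step
    return max(best.values())


def retrieve_segment(segments):
    """Step 1 (assumed scoring): keep segments whose best path contains a filler
    state and return the one with the highest length-normalized log probability."""
    scored = []
    for seg in segments:
        logp, path = viterbi(seg)
        if "filler" in path:
            scored.append((logp / len(seg), seg, path))
    return max(scored) if scored else None


def extract(document):
    """Step 2: read filler tokens off the single retrieved segment only."""
    hit = retrieve_segment(document)
    if hit is None:
        return []
    _, seg, path = hit
    return [w for w, s in zip(seg, path) if s == "filler"]


if __name__ == "__main__":
    doc = [["the", "talk", "is", "by", "dr", "smith"],
           ["the", "talk", "is", "the", "talk"]]
    print(extract(doc))
```

Because extraction runs on one retrieved segment rather than on the whole document, this sketch returns at most one filler span per document, which is the intuition behind the reduced redundancy the abstract reports. The scoring rule used for retrieval here is only a stand-in for the HMM-based retrieval method the authors introduce.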