KXtractor: an effective biomedical information extraction technique based on mixture hidden Markov models

Authors:
Min Song;Il-Yeol Song;Xiaohua Hu;Robert B. Allen
Affiliations:
College of Information Science and Technology, Drexel University, Philadelphia, PA;College of Information Science and Technology, Drexel University, Philadelphia, PA;College of Information Science and Technology, Drexel University, Philadelphia, PA;College of Information Science and Technology, Drexel University, Philadelphia, PA
Venue:
Transactions on Computational Systems Biology II
Year:
2005

Abstract

We present a novel information extraction (IE) technique, KXtractor, which combines a text chunking technique with Mixture Hidden Markov Models (MiHMM). KXtractor overcomes the difficulty that single Part-Of-Speech (POS) HMMs have in modeling rich representations of text, where features overlap among state units such as word, line, sentence, and paragraph. KXtractor also addresses a limitation of traditional HMMs for IE, which operate only on semi-structured data such as HTML documents and other text sources in which language grammar does not play a pivotal role. We compared KXtractor with three IE techniques: 1) RAPIER, an inductive learning-based machine learning system; 2) a dictionary-based extraction system; and 3) a single POS HMM. Our experiments showed that KXtractor outperforms these three IE systems in extracting protein-protein interactions: its F-measure was higher than that of RAPIER, the dictionary-based system, and the single POS HMM by 16.89%, 16.28%, and 8.58%, respectively. In addition, both the precision and the recall of KXtractor were higher than those of the three systems.
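
The comparison in the abstract rests on precision, recall, and F-measure. As a minimal sketch of how the three relate, assuming the balanced F1 form (the abstract does not say which F-measure weighting was used), the function name and the counts below are hypothetical and are not results from the paper:

```python
def precision_recall_f1(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall, F1) for one extraction run.

    Assumption: balanced F1, the harmonic mean of precision and recall.
    """
    predicted = true_positives + false_positives   # everything the extractor returned
    actual = true_positives + false_negatives      # everything it should have returned
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only.
p, r, f = precision_recall_f1(true_positives=80, false_positives=20, false_negatives=20)
```

Because F1 is a harmonic mean, a system can only score well on it by keeping both precision and recall high, which is consistent with the abstract reporting gains on all three measures.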