High precision rule based PPI extraction and per-pair basis performance evaluation

Authors:
Junkyu Lee;Seongsoon Kim;Sunwon Lee;Kyubum Lee;Jaewoo Kang
Affiliations:
Korea University, Seoul, South Korea;Korea University, Seoul, South Korea;Korea University, Seoul, South Korea;Korea University, Seoul, South Korea;Korea University, Seoul, South Korea
Venue:
Proceedings of the ACM sixth international workshop on Data and text mining in biomedical informatics
Year:
2012

Abstract

Virtually all current protein-protein interaction (PPI) extraction studies focus on improving F-score, aiming to balance precision and recall. However, in many realistic scenarios involving large corpora, an extremely high-precision PPI extraction tool is more useful than a high-recall counterpart. We also argue that the current "per-instance" performance evaluation method should be revisited. To address these problems, we introduce a new rule-based PPI extraction method equipped with a set of ultra-high-precision extraction rules. We also propose a new "per-pair" performance metric, which is more pragmatic in practice. The proposed PPI extraction method achieves 95-96% per-pair and 94-97% per-instance precision on the AIMed benchmark corpus.