Towards effective sentence simplification for automatic processing of biomedical text

Authors:
Siddhartha Jonnalagadda;Luis Tari;Jörg Hakenberg;Chitta Baral;Graciela Gonzalez
Affiliations:
Arizona State University, Phoenix, AZ;Arizona State University, Tempe, AZ;Arizona State University, Tempe, AZ;Arizona State University, Tempe, AZ;Arizona State University, Phoenix, AZ
Venue:
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Year:
2009

Abstract

The complexity of sentences characteristic of biomedical articles poses a challenge to natural language parsers, which are typically trained on large-scale corpora of non-technical text. We propose a text simplification process, bioSimplify, that seeks to reduce the complexity of sentences in biomedical abstracts in order to improve the performance of syntactic parsers on the processed sentences. Syntactic parsing is typically one of the first steps in a text mining pipeline, so any improvement in parsing performance would have a ripple effect on all subsequent processing steps. We evaluated our method using a corpus of biomedical sentences annotated with syntactic links. Our empirical results show an improvement of 2.90% for the Charniak-McClosky parser and of 4.23% for the Link Grammar parser when processing simplified sentences rather than the original sentences in the corpus.