Beyond genes, proteins, and abstracts: Identifying scientific claims from full-text biomedical articles

Authors:
Catherine Blake
Affiliations:
School of Library and Information Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820-3302, USA
Venue:
Journal of Biomedical Informatics
Year:
2010

Abstract

Massive increases in electronically available text have spurred a variety of natural language processing methods to automatically identify relationships from text; however, existing annotated collections comprise only bioinformatics (gene-protein) or clinical informatics (treatment-disease) relationships. This paper introduces the Claim Framework, which reflects how authors across the biomedical spectrum communicate findings in empirical studies. The Framework captures different levels of evidence by differentiating between explicit and implicit claims, and by capturing under-specified claims such as correlations, comparisons, and observations. The results from 29 full-text articles show that authors report fewer than 7.84% of scientific claims in an abstract, thus revealing the urgent need for text mining systems to consider the full text of an article rather than just the abstract. The results also show that authors typically report explicit claims (77.12%) rather than observations (9.23%), correlations (5.39%), comparisons (5.11%), or implicit claims (2.7%). Informed by the initial manual annotations, we introduce an automated approach that uses syntax and semantics to identify explicit claims automatically, and we measure the degree to which each feature contributes to the overall precision and recall. Results show that a combination of semantics and syntax is required to achieve the best system performance.
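
The abstract reports per-feature contributions to precision and recall without giving the computation. As a minimal sketch of how such a comparison could be scored, assuming sentence-level evaluation against manual annotations: the function name `precision_recall`, the feature-set labels, and all sentence IDs below are hypothetical and invented for illustration; they are not data or code from the paper.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of sentence IDs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets so the sketch never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical sentence IDs (illustration only, not results from the paper).
gold_claims = {1, 2, 3, 4, 5}            # sentences annotated as explicit claims
syntax_only = {1, 2, 6, 7}               # output of an assumed syntax-only system
syntax_and_semantics = {1, 2, 3, 4, 6}   # output of an assumed combined system

for name, predicted in [
    ("syntax only", syntax_only),
    ("syntax + semantics", syntax_and_semantics),
]:
    p, r = precision_recall(predicted, gold_claims)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Scoring each feature configuration against the same gold set in this way is one common way to see how much a feature adds; the paper's actual evaluation protocol may differ.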