RankPref: ranking sentences describing relations between biomedical entities with an application

Authors:
Catalina O. Tudor;K. Vijay-Shanker
Affiliations:
University of Delaware, Newark, DE;University of Delaware, Newark, DE
Venue:
BioNLP '12 Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Year:
2012

Abstract

This paper presents a machine learning approach that selects and, more generally, ranks sentences containing clear relations between genes and the terms related to them. Ranking is treated as a binary classification task in which preference judgments are used to learn how to choose a sentence from a pair of sentences. The learning process uses features that capture how the relationship is described textually and how central the relationship is in the sentence. Complex sentences are also simplified into simple structures before feature extraction, and we show that this simplification improves the results by up to 13%. In three different evaluations, the system significantly outperforms the baselines.
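
To illustrate how preference judgments can be cast as binary classification, the following is a minimal sketch and not the authors' implementation. It assumes a linear model trained with a simple perceptron on feature differences, and the two features (a clearly stated relation, and the relation being central to the sentence) are hypothetical stand-ins for the feature set the abstract describes. The sentence ids, feature values, and preference pairs are toy data invented for the example.

```python
from itertools import combinations

def pairs_to_examples(features, preferences):
    """Turn (preferred, other) judgments into binary examples on feature differences."""
    examples = []
    for preferred, other in preferences:
        diff = [a - b for a, b in zip(features[preferred], features[other])]
        examples.append((diff, 1))                  # preferred sentence listed first
        examples.append(([-d for d in diff], -1))   # same pair, order swapped
    return examples

def train_perceptron(examples, epochs=20):
    """Learn linear weights that separate 'first is better' from 'second is better'."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            score = sum(w * v for w, v in zip(weights, x))
            if y * score <= 0:  # misclassified (or on the boundary): update
                weights = [w + y * v for w, v in zip(weights, x)]
    return weights

def rank_by_wins(features, weights):
    """Rank sentences by how many pairwise comparisons each one wins."""
    wins = {sid: 0 for sid in features}
    for a, b in combinations(features, 2):
        diff = [x - y for x, y in zip(features[a], features[b])]
        score = sum(w * d for w, d in zip(weights, diff))
        wins[a if score > 0 else b] += 1
    return sorted(wins, key=lambda sid: -wins[sid])

# Hypothetical features: [relation stated clearly, relation central to sentence]
features = {
    "s1": [1.0, 1.0],
    "s2": [1.0, 0.0],
    "s3": [0.0, 0.0],
}
# Toy preference judgments: the first sentence of each pair was preferred.
preferences = [("s1", "s2"), ("s2", "s3"), ("s1", "s3")]

weights = train_perceptron(pairs_to_examples(features, preferences))
ranking = rank_by_wins(features, weights)
print(ranking)
```

Counting pairwise wins is only one way to turn pairwise decisions into a full ranking. With a linear model it gives the same order as sorting sentences by their individual scores, but it also works with classifiers that only output a choice for each pair.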