Using natural language processing to identify pharmacokinetic drug-drug interactions described in drug package inserts

Authors:
Richard Boyce;Gregory Gardner;Henk Harkema
Affiliations:
University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA
Venue:
BioNLP '12 Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Year:
2012

Abstract

The package insert (also known as the drug product label) is the only publicly available source of information on drug-drug interactions (DDIs) for some drugs, especially newer ones. An automated method for identifying DDIs in drug package inserts would therefore be a potentially important complement to methods that identify DDIs from other sources, such as the scientific literature. To develop such an algorithm, we created a corpus of Food and Drug Administration-approved drug package insert statements that were manually annotated for pharmacokinetic DDIs by a pharmacist and a drug information expert. We then evaluated three machine learning algorithms for their ability to 1) identify pharmacokinetic DDIs in the package insert corpus and 2) classify pharmacokinetic DDI statements by their modality (i.e., whether they report a DDI or no interaction between drug pairs). In our experiments, a support vector machine algorithm performed best on both tasks, with an F-measure of 0.859 for pharmacokinetic DDI identification and 0.949 for modality assignment. We also found that syntactic information is very helpful for handling sentences that contain both interacting and non-interacting pairs of drugs.
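For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how such a score can be computed over candidate drug pairs. It assumes the balanced form (F1, the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_measure`, the label strings, and the six example labels are hypothetical and are not taken from the paper's corpus or code.

```python
def f_measure(gold, predicted, positive="DDI"):
    """Return (precision, recall, F1) for the positive class.

    Assumes the balanced F-measure (F1); the abstract does not say
    which variant was used.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations for six candidate drug pairs (illustrative only).
gold = ["DDI", "DDI", "NONE", "DDI", "NONE", "NONE"]
predicted = ["DDI", "NONE", "NONE", "DDI", "DDI", "NONE"]

precision, recall, f1 = f_measure(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Scoring each candidate drug pair separately, rather than each sentence, lets a single sentence contribute both interacting and non-interacting pairs, which is the situation the abstract highlights.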