Empirical Bayesian data mining for discovering patterns in post-marketing drug safety

  • Authors:
  • David M. Fram; June S. Almenoff; William DuMouchel

  • Affiliations:
  • Lincoln Technologies, Inc., Wellesley Hills, MA; GlaxoSmithKline, Research Triangle Park, NC; AT&T Shannon Laboratory, Florham Park, NJ

  • Venue:
  • Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2003

Abstract

Because of practical limits in characterizing the safety profiles of therapeutic products prior to marketing, manufacturers and regulatory agencies perform post-marketing surveillance based on the collection of adverse reaction reports ("pharmacovigilance").

The resulting databases, while rich in real-world information, are notoriously difficult to analyze using traditional techniques. Each report may involve multiple medicines, symptoms, and demographic factors, and there is no easily linked information on drug exposure in the reporting population. KDD techniques, such as association finding, are well matched to the problem, but are difficult for medical staff to apply and interpret.

To deploy KDD effectively for pharmacovigilance, Lincoln Technologies and GlaxoSmithKline collaborated to create a web-based safety data mining environment. The analytical core is a high-performance implementation of the MGPS (Multi-Item Gamma Poisson Shrinker) algorithm described previously by DuMouchel and Pregibon, with several significant extensions and enhancements. The environment offers an interface for specifying data mining runs, a batch execution facility, tabular and graphical methods for exploring associations, and drill-down to case details. Substantial work was involved in preparing the raw adverse event data for mining, including harmonization of drug names and removal of duplicate reports.

The environment can be used to explore both drug-event and multi-way associations (interactions, syndromes). It has been used to study age/gender effects, to predict the safety profiles of proposed combination drugs, and to separate contributions of individual drugs to safety problems in polytherapy situations.
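
The abstract names association finding and a gamma-Poisson shrinkage method but does not spell out the computation. As a rough illustration only, the sketch below counts how often each drug-event pair is reported, compares that count with the count expected if drugs and events were reported independently, and then shrinks the ratio toward 1 using a single gamma prior. Everything here is an assumption made for illustration: the function names, the sample reports, and the fixed prior parameters `alpha` and `beta` are hypothetical, and the single-prior shrinkage is a simplification that does not reproduce the MGPS prior, its fitting procedure, or the authors' extensions.

```python
from collections import Counter
from itertools import product


def relative_reporting_ratios(reports):
    """Return {(drug, event): (observed, expected, ratio)}.

    `reports` is a list of (drugs, events) pairs, each a set of names.
    `expected` is the pair count implied by independent reporting of
    the drug and the event across all reports.
    """
    n_reports = len(reports)
    drug_counts = Counter()
    event_counts = Counter()
    pair_counts = Counter()
    for drugs, events in reports:
        drug_counts.update(drugs)
        event_counts.update(events)
        pair_counts.update(product(drugs, events))

    results = {}
    for (drug, event), observed in pair_counts.items():
        expected = drug_counts[drug] * event_counts[event] / n_reports
        results[(drug, event)] = (observed, expected, observed / expected)
    return results


def shrunken_ratio(observed, expected, alpha=1.0, beta=1.0):
    """Posterior mean of the ratio under a single Gamma(alpha, beta) prior.

    Assumes observed ~ Poisson(ratio * expected). The prior parameters
    are fixed by hand here (an assumption); they are not fitted to data.
    """
    return (observed + alpha) / (expected + beta)


# Hypothetical reports, invented purely to exercise the functions.
reports = [
    ({"A"}, {"rash"}),
    ({"A", "B"}, {"rash", "nausea"}),
    ({"B"}, {"nausea"}),
    ({"C"}, {"headache"}),
]
ratios = relative_reporting_ratios(reports)
```

In this toy data the pair reported only once has the largest raw ratio because its expected count is tiny; the shrunken ratio pulls such low-count pairs back toward 1, which is the general motivation for shrinkage when many drug-event cells are sparse.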