Proposal and execution of clinical trials, computation of quality measures, and discovery of correlations between medical phenomena are all applications in which an accurate count of patients is needed. However, existing sources of this type of patient information, including Clinical Data Warehouses (CDWs), may be incomplete or inaccurate. This research explores applying probabilistic techniques, supported by the MayBMS probabilistic database, to obtain accurate patient counts from a Clinical Data Warehouse containing synthetic patient data. We present a synthetic Clinical Data Warehouse and populate it with simulated data using a custom patient data generation engine. We then implement, evaluate, and compare different techniques for obtaining patient counts. We model billing as a test for the presence of a condition. We compute billing's sensitivity and specificity in two ways: by conducting a "Simulated Expert Review", in which a representative sample of records is reviewed and labeled by experts, and by obtaining the ground truth for every record. We then compute the posterior probability of a patient having a condition using two methods. The first is a "Bayesian Chain", which applies Bayes' Theorem after each visit to update the probability that the patient has the condition. The second is a "one-shot" approach that computes the probability of a patient having a condition based on whether the patient is ever billed for the condition. Our results demonstrate the utility of probabilistic approaches, which improve on the accuracy of raw counts. In particular, the simulated review paired with a single application of Bayes' Theorem produces the best results, with an average error rate of 2.1% compared to 43.7% for the straightforward billing counts. Overall, this research demonstrates that Bayesian probabilistic approaches improve patient counts on simulated patient populations. We believe that total patient counts based on billing data are one of the many possible applications of our Bayesian framework.
Use of these probabilistic techniques will enable more accurate patient counts and better results for applications requiring this metric.
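
The two posterior computations described above can be illustrated with a minimal Python sketch. It is not the implementation used in this research, which relies on MayBMS. The function names, the per-visit boolean encoding of billing events, and the choice to estimate a patient count by summing posteriors are all assumptions made for illustration. The sketch also reuses one sensitivity and specificity pair for both methods, whereas in practice the "ever billed" test would likely have its own values.

```python
def bayes_update(prior, sensitivity, specificity, billed):
    """Posterior P(condition | one billing observation), by Bayes' Theorem.

    billed=True means the condition was billed; False means it was not.
    """
    if billed:
        p_obs_given_cond = sensitivity            # true positive rate
        p_obs_given_no_cond = 1.0 - specificity   # false positive rate
    else:
        p_obs_given_cond = 1.0 - sensitivity      # false negative rate
        p_obs_given_no_cond = specificity         # true negative rate
    numerator = p_obs_given_cond * prior
    denominator = numerator + p_obs_given_no_cond * (1.0 - prior)
    if denominator == 0.0:
        # The observation is impossible under these parameters; keep the prior.
        return prior
    return numerator / denominator


def bayesian_chain(prior, sensitivity, specificity, visits):
    """Apply Bayes' Theorem once per visit, feeding each posterior forward."""
    p = prior
    for billed in visits:
        p = bayes_update(p, sensitivity, specificity, billed)
    return p


def one_shot(prior, sensitivity, specificity, visits):
    """Apply Bayes' Theorem once, on whether the patient was ever billed."""
    return bayes_update(prior, sensitivity, specificity, any(visits))


def expected_count(posteriors):
    """Assumed aggregation: expected patient count as the sum of posteriors."""
    return sum(posteriors)


if __name__ == "__main__":
    # Hypothetical values: prevalence 0.5, sensitivity 0.8, specificity 0.9.
    visits = [True, False]  # billed at the first visit, not at the second
    print(bayesian_chain(0.5, 0.8, 0.9, visits))
    print(one_shot(0.5, 0.8, 0.9, visits))
```

With these hypothetical inputs the chain lowers its estimate after the unbilled second visit, while the one-shot method only registers that a billing occurred at some point.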