The General Practice Research Database (GPRD) is a collection of anonymised patient records obtained from UK general practices. The data are representative of approximately 8% of the UK population and are collected mainly for research purposes, which include assessing risk factors for disease, evaluating the side effects of drugs and comparing the effectiveness of different drugs. The data are used internationally by academics, governments and the pharmaceutical industry. As research findings arising from GPRD data may have public health and safety implications, it is of crucial importance that the data collected are of high quality. Data quality may vary within and between practices and may depend on the time of data collection. Although the GPRD's established framework for assessing data quality is comprehensive, it does not allow a systematic review of individual practice data quality markers. We are developing a framework to improve on existing methods of data quality assessment. We shall extend the set of current quality measures for each practice and, using statistical pattern recognition techniques, shall develop algorithms that combine these measures into a smaller number of meaningful quality scores, each reflecting a different aspect of data quality and measurable over time. We report the aims and rationale of the study and preliminary results.
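The abstract does not say which pattern recognition technique is used to combine the quality measures. As one possible illustration only, the sketch below assumes a principal-component approach: each practice's measures are standardised, and the leading principal component (found by power iteration) gives one composite score per practice. The function names, the three measures and all numeric values are hypothetical and are not taken from the GPRD.

```python
import math


def standardise(rows):
    """Scale each column (quality measure) to zero mean and unit variance.

    Assumes every measure varies across practices (no zero-variance column).
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def covariance(z):
    """Sample covariance matrix of already-centred rows."""
    n, p = len(z), len(z[0])
    return [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_component(cov, iterations=500):
    """Approximate the leading eigenvector by power iteration.

    Starts from a uniform vector, which suits measures that are
    positively correlated with one another (an assumption of this sketch).
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def composite_scores(rows):
    """Project each practice's standardised measures onto the leading component."""
    z = standardise(rows)
    v = leading_component(covariance(z))
    return [sum(a * b for a, b in zip(r, v)) for r in z]


# Hypothetical data: one row per practice, one column per quality measure
# (for example, three completeness proportions). Values are invented.
practices = [
    [0.95, 0.90, 0.88],
    [0.80, 0.75, 0.70],
    [0.60, 0.65, 0.55],
    [0.99, 0.97, 0.93],
    [0.70, 0.60, 0.65],
]

scores = composite_scores(practices)
for index, score in enumerate(scores):
    print(f"practice {index}: composite score {score:+.2f}")
```

Under these assumptions a higher score indicates a practice whose measures are jointly above average. Retaining further components would give several scores, each capturing a different aspect of quality, and recomputing them per collection period would allow the scores to be tracked over time.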