Objective: We propose classification integration as a new method for data integration from different sources. We also propose reclassification as a new method of combining existing medical classifications for different classes.
Background: In many problems the raw data are already classified according to a set of features but need to be reclassified. Data reclassification is usually achieved using data integration methods that require the raw data, which may not be available or sharable because of privacy and legal concerns.
Methodology: We introduce general classification integration and reclassification methods that create new classes by flexibly combining the existing classes without requiring access to the raw data. The flexibility is achieved by representing any linear classification in a constraint database.
Results: Experiments using support vector machines and decision trees on heart disease diagnosis and primary biliary cirrhosis data show that our classification integration method is more accurate than current data integration methods when there are many missing values in the data. The reclassification problem can also be solved using constraint databases without requiring access to the raw data.
Conclusions: The classification integration and reclassification methods are applied to two particular data sets. Besides these particular cases, our general method is also appropriate for many other application areas and may yield similar accuracy improvements. These methods may also be extended to non-linear classifiers.
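The idea of representing a linear classification as constraints and combining existing classes without the raw data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names, the coefficients, the class labels, and the helper names `LinearConstraint` and `classify` are hypothetical, and plain Python objects stand in for a constraint database. Each existing class is stored as a linear inequality over the features, and a new class is defined as the conjunction of two stored classes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LinearConstraint:
    """Stores the inequality sum(coeffs[f] * x[f]) + bias >= 0."""
    coeffs: Dict[str, float]
    bias: float

    def holds(self, record: Dict[str, float]) -> bool:
        # Evaluate the linear expression on one record.
        total = sum(c * record[f] for f, c in self.coeffs.items())
        return total + self.bias >= 0


# Two hypothetical linear classifiers, e.g. obtained from separate sources.
# Only their coefficients are kept; the training data is not needed here.
CLASS_A = LinearConstraint({"age": 0.05, "cholesterol": 0.01}, -5.0)
CLASS_B = LinearConstraint({"age": 0.02, "blood_pressure": 0.03}, -5.0)


def classify(record: Dict[str, float]) -> str:
    """Assign one of four new classes by combining the two stored constraints."""
    in_a = CLASS_A.holds(record)
    in_b = CLASS_B.holds(record)
    if in_a and in_b:
        return "A_and_B"
    if in_a:
        return "A_only"
    if in_b:
        return "B_only"
    return "neither"


if __name__ == "__main__":
    patient = {"age": 60, "cholesterol": 250, "blood_pressure": 140}
    print(classify(patient))
```

Because the new classes are defined purely over the stored inequalities, the combination step never touches the records used to train either classifier, which is the property the methodology relies on.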