We describe an approach to learning predictive models from large databases in settings where direct access to the data is not available because of the massive size of the data, access restrictions, or bandwidth constraints. We outline techniques for minimizing the number of statistical queries needed and for coping efficiently with missing values in the data. We provide an open source implementation of the decision tree and Naive Bayes algorithms to demonstrate the feasibility of the proposed approach.
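To make the idea of learning from statistical queries concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `CountOracle` that stands in for a remote data source and answers only count queries, and it trains a Naive Bayes classifier from those counts alone. The class and function names, the toy data, the Laplace smoothing, and the choice to simply skip missing values (stored as `None`) are all assumptions made for illustration.

```python
import math


class CountOracle:
    """Hypothetical stand-in for a data source that answers only count queries."""

    def __init__(self, rows):
        self._rows = rows      # never exposed to the learner
        self.queries = 0       # number of statistical queries issued so far

    def count(self, conditions):
        """Return how many rows match every attribute=value pair in `conditions`."""
        self.queries += 1
        return sum(
            all(row.get(attr) == value for attr, value in conditions.items())
            for row in self._rows
        )


def train_naive_bayes(oracle, attributes, class_attr, class_values):
    """Estimate Naive Bayes parameters using count queries only."""
    class_counts = {c: oracle.count({class_attr: c}) for c in class_values}
    total = sum(class_counts.values())
    # Laplace-smoothed class priors (smoothing is an assumption of this sketch).
    prior = {c: (class_counts[c] + 1) / (total + len(class_values))
             for c in class_values}
    cond = {}
    for attr, values in attributes.items():
        for c in class_values:
            counts = {v: oracle.count({attr: v, class_attr: c}) for v in values}
            # Rows where `attr` is missing match no value, so they drop out here.
            observed = sum(counts.values())
            for v in values:
                cond[(attr, v, c)] = (counts[v] + 1) / (observed + len(values))
    return prior, cond


def predict(prior, cond, instance):
    """Return the most probable class; missing attribute values are skipped."""
    scores = {}
    for c, p in prior.items():
        score = math.log(p)
        for attr, value in instance.items():
            if value is not None:
                score += math.log(cond[(attr, value, c)])
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data, invented for illustration; None marks a missing value.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": None, "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": None, "windy": "no", "play": "yes"},
]
attributes = {"outlook": ["sunny", "rainy"], "windy": ["yes", "no"]}

oracle = CountOracle(rows)
prior, cond = train_naive_bayes(oracle, attributes, "play", ["yes", "no"])
print(predict(prior, cond, {"outlook": "sunny", "windy": "no"}))
print("count queries issued:", oracle.queries)
```

The learner never touches `rows` directly: everything it needs arrives as counts, so the number of queries, rather than the size of the data, determines the communication cost. In this sketch that number is one query per class value plus one per (attribute value, class value) pair.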