We present an efficient method for constructing decision trees from a large data set that is assumed to be stored on a database server and accessible through SQL queries. The method minimizes the total time of data transmission between client and server. Based on a divide-and-conquer search strategy, it minimizes the number of simple queries needed to search for the best cuts. To make this possible, we develop approximate measures, defined on intervals of attribute values, that evaluate the chance that the best cut belongs to a given interval. We propose applications of the presented approach to discretization and to the construction of soft decision trees, a novel classifier model.
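The divide-and-conquer search described above can be illustrated with a minimal Python sketch. It is not the method from the paper: the table `data(a, cls)`, the function names, the parameter `k`, the pair-counting cut quality, and the interval score `(W(left) + W(right) + conflict) / 2` are all assumptions chosen for illustration, and an in-memory SQLite database stands in for the database server. Each round splits the current interval into `k` parts, issues one simple count query per part, and descends into the part whose score suggests it is most likely to contain the best cut.

```python
import sqlite3

def class_counts(conn, lo, hi):
    """One simple query: class distribution of rows with lo <= a < hi."""
    cur = conn.execute(
        "SELECT cls, COUNT(*) FROM data WHERE a >= ? AND a < ? GROUP BY cls",
        (lo, hi))
    return dict(cur.fetchall())

def add(x, y):
    """Merge two class-count dictionaries."""
    return {c: x.get(c, 0) + y.get(c, 0) for c in set(x) | set(y)}

def cross_pairs(left, right):
    """Pairs of rows on opposite sides of a cut with different classes."""
    same = sum(n * right.get(c, 0) for c, n in left.items())
    return sum(left.values()) * sum(right.values()) - same

def conflict(counts):
    """Pairs of rows inside one interval that belong to different classes."""
    total = sum(counts.values())
    return (total * total - sum(n * n for n in counts.values())) // 2

def best_cut(conn, lo, hi, k=4, min_width=1e-3):
    """Search [lo, hi) for a cut `a < c`; hi must exceed the largest value."""
    left_out, right_out = {}, {}   # class counts outside the current interval
    best, best_q, queries = None, -1, 0
    while hi - lo > min_width:
        step = (hi - lo) / k
        bounds = [lo + i * step for i in range(k)] + [hi]
        parts = [class_counts(conn, bounds[i], bounds[i + 1]) for i in range(k)]
        queries += k
        # Class counts left and right of every boundary, then cut quality.
        lefts = [left_out]
        for p in parts:
            lefts.append(add(lefts[-1], p))
        total = add(lefts[-1], right_out)
        rights = [{c: total[c] - l.get(c, 0) for c in total} for l in lefts]
        quality = [cross_pairs(l, r) for l, r in zip(lefts, rights)]
        for b, q in zip(bounds, quality):
            if q > best_q:
                best, best_q = b, q
        # Assumed interval score. Single-class intervals are skipped because
        # this quality is linear inside them, so their best cut is an endpoint.
        scored = [((quality[i] + quality[i + 1] + conflict(parts[i])) / 2, i)
                  for i in range(k) if conflict(parts[i]) > 0]
        if not scored:
            break
        _, i = max(scored)
        left_out, right_out = lefts[i], rights[i + 1]
        lo, hi = bounds[i], bounds[i + 1]
    return best, best_q, queries

# Toy data (hypothetical): values 0..99, class 1 exactly when a >= 37.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (a REAL, cls INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [(v, int(v >= 37)) for v in range(100)])
cut, quality, queries = best_cut(conn, 0, 100)
```

On this toy table the search settles on a cut between 36 and 37 after 16 count queries, and only class counts, never the rows themselves, travel from the database to the client. The score used here is a heuristic, so on less regular data the returned cut should be treated as approximate.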