We present an efficient method for constructing decision trees from large data sets that are assumed to be stored on database servers and accessible through SQL queries. The proposed method minimizes the number of simple queries needed to find the best splits (cut points) by employing a "divide and conquer" search strategy. To make this possible, we develop novel evaluation measures defined on intervals of attribute domains; these measures estimate the quality of the best cut within a given interval. We also propose applications of the presented approach to discretization and to the construction of a "soft decision tree", a novel classifier model.
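The divide-and-conquer search can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the table `data` with a numeric attribute `a` and a decision column `d`, the helper names, the parameters `k` and `rounds`, and above all the interval score, which is a simple stand-in (average of the cut quality at the two endpoints plus half the number of conflicting pairs inside the interval) and not necessarily the measure developed in the paper. Cut quality is taken here to be the number of pairs of objects from different classes that a cut separates.

```python
import sqlite3


def class_counts(conn, lo, hi):
    """One 'simple query': class distribution of the rows with lo < a <= hi."""
    rows = conn.execute(
        "SELECT d, COUNT(*) FROM data WHERE a > ? AND a <= ? GROUP BY d",
        (lo, hi),
    ).fetchall()
    return dict(rows)


def discerned_pairs(left, right):
    """Number of pairs of objects from different classes separated by a cut."""
    all_pairs = sum(left.values()) * sum(right.values())
    same_class = sum(n * right.get(c, 0) for c, n in left.items())
    return all_pairs - same_class


def conflicts(counts):
    """Number of pairs of objects from different classes inside one interval."""
    n = sum(counts.values())
    return (n * n - sum(v * v for v in counts.values())) // 2


def best_cut(conn, k=4, rounds=10):
    """Narrow the candidate range for the cut, using k simple queries per round."""
    # Start just below the minimum so that no row lies at or below `lo`.
    lo, hi = conn.execute("SELECT MIN(a) - 1, MAX(a) FROM data").fetchone()
    total = dict(conn.execute("SELECT d, COUNT(*) FROM data GROUP BY d").fetchall())
    left = {}        # class counts of the rows with a <= lo
    best = (lo, 0)   # (cut point, quality) of the best boundary seen so far
    for _ in range(rounds):
        # Split (lo, hi] into k sub-intervals; reuse `hi` exactly as the last bound.
        bounds = [lo + i * (hi - lo) / k for i in range(k)] + [hi]
        parts = [class_counts(conn, bounds[i], bounds[i + 1]) for i in range(k)]
        # Cumulative class counts to the left of every boundary.
        lefts = [dict(left)]
        for part in parts:
            nxt = dict(lefts[-1])
            for c, n in part.items():
                nxt[c] = nxt.get(c, 0) + n
            lefts.append(nxt)
        quality = [
            discerned_pairs(l, {c: total[c] - l.get(c, 0) for c in total})
            for l in lefts
        ]
        # Stand-in interval score (an assumption, see the lead-in above).
        scores = [
            (quality[i] + quality[i + 1] + conflicts(parts[i])) / 2
            for i in range(k)
        ]
        i = max(range(k), key=scores.__getitem__)
        j = i if quality[i] >= quality[i + 1] else i + 1
        best = (bounds[j], quality[j])
        if not parts[i]:
            break  # no rows inside the chosen interval, nothing left to refine
        lo, hi, left = bounds[i], bounds[i + 1], lefts[i]
    return best


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (a REAL, d INTEGER)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?)",
        [(v, 0 if v < 50 else 1) for v in range(100)],
    )
    print(best_cut(conn))
```

Each round issues only `k` aggregate queries regardless of how many distinct attribute values fall in the range, which is the point of scoring intervals instead of individual cuts. The quality of the result depends entirely on how well the interval score predicts the best cut inside the interval.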