On Efficient Handling of Continuous Attributes in Large Data Bases
Fundamenta Informaticae
The main task in decision tree construction algorithms is to find the "best partition" of the set of objects. In this paper, we investigate the problem of finding an optimal binary partition of a continuous attribute for large data sets stored in relational databases. The critical factor for the time complexity of algorithms solving this problem is the number of simple SQL queries needed to construct such partitions. The straightforward approach to optimal partition selection needs at least O(N) queries, where N is the number of pre-assumed partitions of the search space. We show some properties of optimization measures related to discernibility between objects that make it possible to construct a partition very close to optimal using only O(log N) simple queries.
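To make the query-count argument concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a hypothetical table `objects(a, d)` holding a continuous attribute `a` and a decision class `d`, and it assumes the discernibility of a cut is measured as the number of pairs of objects from different classes that the cut separates. The search is a plain ternary narrowing over the sorted candidate cuts, swapped in for illustration: it issues two simple counting queries per step, so O(log N) in total, but it is exact only when the measure is unimodal over the cuts. All function and variable names are illustrative.

```python
import sqlite3
from collections import Counter

def disc(left, total):
    """Pairs of objects from different classes separated by a cut (assumed measure)."""
    right = total - left
    pairs = sum(left.values()) * sum(right.values())
    same_class = sum(n * right[c] for c, n in left.items())
    return pairs - same_class

def find_cut(conn, cuts):
    """Ternary narrowing over sorted candidate cuts; a cut c splits a < c from a >= c."""
    queries = 0

    def counts(where, params=()):
        # One "simple query": class distribution of the objects matching `where`.
        nonlocal queries
        queries += 1
        sql = f"SELECT d, COUNT(*) FROM objects WHERE {where} GROUP BY d"
        return Counter(dict(conn.execute(sql, params).fetchall()))

    total = counts("1 = 1")
    lo, hi = 0, len(cuts) - 1
    base = counts("a < ?", (cuts[lo],))  # class counts left of cuts[lo]
    best = (disc(base, total), cuts[lo])
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        left1 = base + counts("a >= ? AND a < ?", (cuts[lo], cuts[m1]))
        left2 = left1 + counts("a >= ? AND a < ?", (cuts[m1], cuts[m2]))
        f1, f2 = disc(left1, total), disc(left2, total)
        best = max(best, (f1, cuts[m1]), (f2, cuts[m2]))
        if f1 < f2:
            lo, base = m1, left1  # keep the part to the right of m1
        else:
            hi = m2               # keep the part to the left of m2
    for i in range(lo + 1, hi + 1):  # at most two cuts remain unevaluated
        left = base + counts("a >= ? AND a < ?", (cuts[lo], cuts[i]))
        best = max(best, (disc(left, total), cuts[i]))
    return best[1], best[0], queries

# Toy data (fabricated for illustration): class 0 below a = 700, class 1 from 700 up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (a INTEGER, d INTEGER)")
conn.executemany("INSERT INTO objects VALUES (?, ?)",
                 [(a, int(a >= 700)) for a in range(1000)])
best_cut, best_value, n_queries = find_cut(conn, list(range(1, 1000)))
print(best_cut, best_value)  # 700 210000
```

On this toy table the measure rises linearly up to the class boundary and falls afterwards, so the narrowing reaches the best cut with a few dozen counting queries rather than one query per candidate cut. On real data the measure need not be unimodal, which is why the result should be read as "close to optimal" at best.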