We consider the problem of finding a set of attribute values that gives a high-quality binary segmentation of a database. The quality of a segmentation is measured by an objective function chosen to suit the user's goal, such as "mean squared error," "mutual information," or "χ²," each of which is defined in terms of the distribution of a given target attribute. Our goal is to find a value group on a given conditional domain that splits the database into two segments and optimizes the value of the objective function. Though the problem is intractable for general objective functions, there are feasible algorithms for finding high-quality binary segmentations when the objective function is convex, and we prove that the typical criteria mentioned above are all convex. We propose two practical algorithms, based on computational geometry techniques, which find much better value groups than conventional heuristics do.
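To make the problem statement concrete, the following is a minimal sketch of value-group selection, not the algorithms proposed in the paper. It assumes a two-class target, a categorical conditional attribute summarized as per-value counts, and mutual information as the objective. The names `mutual_information`, `best_group_exhaustive`, and `best_group_sorted` are hypothetical. The exhaustive search illustrates why the general problem is expensive (it visits every proper subset of values), while the sorted-prefix search relies on the well-known property that, for a two-class target and a convex objective, an optimal group can be found among prefixes of the values ordered by their positive rate.

```python
from itertools import combinations
from math import log2


def mutual_information(stats, group):
    """Mutual information between segment membership and a binary target.

    stats maps each attribute value to (row_count, positive_count);
    every row_count is assumed to be positive. group is the set of
    values placed in the first segment.
    """
    def entropy(n, p):
        # Binary entropy of a segment with n rows, p of them positive.
        if n == 0:
            return 0.0
        return -sum((k / n) * log2(k / n) for k in (p, n - p) if k > 0)

    total_n = sum(n for n, _ in stats.values())
    total_p = sum(p for _, p in stats.values())
    n1 = sum(stats[v][0] for v in group)
    p1 = sum(stats[v][1] for v in group)
    n2, p2 = total_n - n1, total_p - p1
    return (entropy(total_n, total_p)
            - (n1 / total_n) * entropy(n1, p1)
            - (n2 / total_n) * entropy(n2, p2))


def best_group_exhaustive(stats, objective):
    """Try every non-empty proper subset of values (exponential time)."""
    values = sorted(stats)
    best_score, best_group = float("-inf"), frozenset()
    for size in range(1, len(values)):
        for combo in combinations(values, size):
            score = objective(stats, combo)
            if score > best_score:
                best_score, best_group = score, frozenset(combo)
    return best_score, best_group


def best_group_sorted(stats, objective):
    """Try only prefixes of the values ordered by positive rate.

    Assumption: two-class target and convex objective, so an optimal
    group is among these prefixes.
    """
    order = sorted(stats, key=lambda v: stats[v][1] / stats[v][0])
    best_score, best_group = float("-inf"), frozenset()
    for i in range(1, len(order)):
        score = objective(stats, order[:i])
        if score > best_score:
            best_score, best_group = score, frozenset(order[:i])
    return best_score, best_group


# Illustrative counts only: value -> (rows, rows with a positive target).
example_stats = {"a": (10, 1), "b": (10, 9), "c": (10, 2), "d": (10, 8)}
exhaustive_score, exhaustive_group = best_group_exhaustive(
    example_stats, mutual_information)
sorted_score, sorted_group = best_group_sorted(
    example_stats, mutual_information)
```

On the illustrative counts above, both searches reach the same objective value while the sorted-prefix search evaluates only three candidate groups. That shortcut does not carry over directly to targets with more than two classes.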