Computing Optimal Hypotheses Efficiently for Boosting

  • Authors:
  • Shinichi Morishita

  • Affiliations:
  • -

  • Venue:
  • Progress in Discovery Science, Final Report of the Japanese Discovery Science Project
  • Year:
  • 2002


Abstract

This paper sheds light on a strong connection between AdaBoost and several optimization algorithms for data mining. AdaBoost has attracted much interest as an effective methodology for classification tasks. In each round, AdaBoost generates one hypothesis, and it finally makes a highly accurate prediction by taking a weighted majority vote over the resulting hypotheses. Freund and Schapire have remarked that using simple hypotheses, such as single-test decision trees instead of huge trees, is promising for achieving high accuracy and avoiding overfitting to the training data. One major drawback of this approach, however, is that the accuracy of individual simple hypotheses may not always be high, which demands a way of computing more accurate (or the most accurate) simple hypotheses efficiently. In this paper, we consider several classes of simple but expressive hypotheses, such as ranges and regions for numeric attributes, subsets of categorical values, and conjunctions of Boolean tests. For each class, we develop an efficient algorithm for choosing the optimal hypothesis.