On the Complexity of Optimal Multisplitting

  • Authors:
  • Tapio Elomaa; Juho Rousu

  • Affiliations:
  • -;-

  • Venue:
  • ISMIS '00 Proceedings of the 12th International Symposium on Foundations of Intelligent Systems
  • Year:
  • 2000

Quantified Score

Hi-index 0.01

Abstract

Dynamic programming has been studied extensively, e.g., in computational geometry and string matching. It has recently found a new application in the optimal multisplitting of numerical attribute value domains. We relate earlier results to this problem and study whether they shed new light on the inherent complexity of this time-critical subtask of machine learning and data mining programs. The concept of monotonicity has come up in earlier research. It helps to explain why optimal multisplitting has different asymptotic time requirements under different attribute evaluation functions. As case studies we examine the Training Set Error and Average Class Entropy functions. The former has a linear-time optimization algorithm, while the latter, like most well-known attribute evaluation functions, takes quadratic time to optimize. It is shown that neither of them fulfills the strict monotonicity condition, but computing optimal Training Set Error values can be decomposed into monotone subproblems.
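To make the quadratic-time dynamic programming scheme mentioned in the abstract concrete, the following Python sketch partitions a numerical attribute into at most a given number of intervals so that an additive evaluation function is minimized. It is an illustration only, not the authors' implementation: the names `optimal_multisplit`, `training_set_error`, and `weighted_entropy` are hypothetical, cut points are assumed to lie midway between adjacent distinct values, and the input is assumed to be non-empty and numeric. The sketch applies the same generic quadratic recurrence to both functions; it does not implement the linear-time algorithm for Training Set Error.

```python
from collections import Counter
from math import log2


def training_set_error(counts):
    """Examples in an interval that fall outside its majority class."""
    return sum(counts) - max(counts)


def weighted_entropy(counts):
    """Interval size times class entropy.

    Dividing the summed result by the number of examples gives the
    average class entropy of the partition.
    """
    n = sum(counts)
    return -sum(c * log2(c / n) for c in counts if c > 0)


def optimal_multisplit(values, labels, max_intervals, cost):
    """Return (total cost, sorted cut points) of a best partition.

    Assumes non-empty numeric `values`; `cost` maps a list of class
    counts to a number and is summed over the intervals.
    """
    classes = sorted(set(labels))
    # Examples with equal values can never be separated, so group them.
    bins = {}
    for v, y in zip(values, labels):
        bins.setdefault(v, Counter())[y] += 1
    keys = sorted(bins)
    m = len(keys)

    # prefix[j][c] = number of examples of class c in the first j bins.
    prefix = [[0] * len(classes)]
    for key in keys:
        prefix.append([p + bins[key][c] for p, c in zip(prefix[-1], classes)])

    def interval_cost(i, j):
        return cost([b - a for a, b in zip(prefix[i], prefix[j])])

    inf = float("inf")
    k_max = min(max_intervals, m)
    # best[k][j] = minimum cost of splitting the first j bins into k intervals.
    best = [[inf] * (m + 1) for _ in range(k_max + 1)]
    back = [[0] * (m + 1) for _ in range(k_max + 1)]
    best[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, m + 1):
            for i in range(k - 1, j):  # last interval covers bins i..j-1
                cand = best[k - 1][i] + interval_cost(i, j)
                if cand < best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i

    # Prefer the smallest number of intervals that attains the minimum.
    k = min(range(1, k_max + 1), key=lambda q: best[q][m])
    total = best[k][m]
    cuts, j = [], m
    while k > 0:
        i = back[k][j]
        if i > 0:
            cuts.append((keys[i - 1] + keys[i]) / 2)
        j, k = i, k - 1
    return total, sorted(cuts)
```

For example, `optimal_multisplit([1, 2, 3, 4, 5, 6], list("aabbaa"), 3, training_set_error)` separates the two `b` examples from the `a` examples on either side. The three nested loops evaluate the cost function a number of times that grows quadratically with the number of distinct values for each interval count, which is the behavior the abstract attributes to Average Class Entropy and most other evaluation functions.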