Improving the design of induction methods by analyzing algorithm functionality and data-based concept complexity

  • Authors:
  • Larry Rendell;Harish Ragavan

  • Affiliations:
  • Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL;Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • IJCAI'93 Proceedings of the 13th International Joint Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1993

Abstract

Although empirical machine learning has seen many algorithms, one of its most important goals has been neglected. Important real-world problems often have just a primitive representation, to which the target concept bears only a remote, obscure relationship. This consideration leads to a class of measures that may be applied to data to estimate difficulty for standard algorithms. As the concept becomes harder, current decision tree and decision list methods give increasingly poor accuracy, though backpropagation does better. A new system for feature construction scales up best. The fundamental limitation of standard algorithms is caused by two problems: greedy search and representational inadequacy. Critical analysis and empirical results show that lookahead alleviates the greedy hill-climbing problem at high cost, but even this is insufficient. Combining lookahead with feature construction alleviates the "complex global replication" problem with hard concepts. For principled algorithm development and good progress, researchers need to study hard concepts and system behavior using them.
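The greedy-search limitation named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' system or their complexity measure; it is a toy example on an assumed parity-style concept (y = x0 XOR x1, with x2 irrelevant), and the helper names `entropy` and `info_gain` are hypothetical. It shows that a one-step greedy criterion sees no useful attribute, while evaluating a pair of attributes jointly (a crude stand-in for lookahead or a constructed feature) recovers the concept.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, labels, attrs):
    """Information gain from partitioning on the joint value of `attrs`."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Assumed toy concept: y = x0 XOR x1; x2 is an irrelevant attribute.
rows = list(product([0, 1], repeat=3))
labels = [r[0] ^ r[1] for r in rows]

# Greedy, one-attribute-at-a-time evaluation: every attribute looks useless.
single_gains = {a: info_gain(rows, labels, (a,)) for a in range(3)}

# Evaluating attribute pairs jointly: only (x0, x1) separates the classes.
pair_gains = {p: info_gain(rows, labels, p) for p in combinations(range(3), 2)}

print(single_gains)
print(pair_gains)
```

Under these assumptions, a greedy splitter has no basis for preferring x0 or x1 over the irrelevant x2, whereas the joint evaluation singles out the pair (x0, x1). The joint evaluation also inspects more candidates than the single-attribute pass, which is consistent with the abstract's remark that lookahead comes at high cost.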