Generalization and decision tree induction: efficient classification in data mining

  • Authors:
  • M. Kamber;L. Winstone;Wan Gong;Shan Cheng;Jiawei Han

  • Venue:
  • Proceedings of the 7th International Workshop on Research Issues in Data Engineering (RIDE '97): High Performance Database Management for Large-Scale Applications
  • Year:
  • 1997

Abstract

Efficiency and scalability are fundamental issues in mining large databases. Although classification has been studied extensively, few known methods seriously consider efficient induction in large databases or the analysis of data at multiple abstraction levels. This paper addresses both issues by proposing a data classification method that integrates attribute-oriented induction, relevance analysis, and decision tree induction. This integration yields efficient, high-quality, multiple-level classification of large amounts of data, relaxes the requirement for perfect training sets, and elegantly handles continuous and noisy data.
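
The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the concept hierarchy (CITY_TO_REGION), the toy records, the 0.1 threshold, and every function name are hypothetical. Information gain stands in for the relevance measure, which the abstract does not specify, and the tree builder is a plain ID3-style recursion.

```python
from collections import Counter
from math import log2

# Hypothetical concept hierarchy: low-level values mapped to a higher-level concept.
CITY_TO_REGION = {"Vancouver": "West", "Victoria": "West",
                  "Toronto": "East", "Montreal": "East"}


def generalize(rows, attr, hierarchy):
    """Attribute-oriented induction step: lift each value of `attr` one level up."""
    return [{**r, attr: hierarchy.get(r[attr], r[attr])} for r in rows]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, attr, target):
    """Reduction in class entropy obtained by splitting `rows` on `attr`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[target] for r in rows]) - remainder


def relevant_attributes(rows, attrs, target, threshold):
    """Relevance analysis (assumed measure): keep attributes whose gain meets the threshold."""
    return [a for a in attrs if info_gain(rows, a, target) >= threshold]


def build_tree(rows, attrs, target):
    """ID3-style induction: returns a class label or (attribute, {value: subtree})."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return (best, {
        value: build_tree([r for r in rows if r[best] == value], rest, target)
        for value in {r[best] for r in rows}
    })


# Toy training records (invented for illustration only).
rows = [
    {"city": "Vancouver", "student": "yes", "buys": "yes"},
    {"city": "Victoria", "student": "no", "buys": "yes"},
    {"city": "Toronto", "student": "yes", "buys": "no"},
    {"city": "Montreal", "student": "no", "buys": "no"},
]

generalized = generalize(rows, "city", CITY_TO_REGION)
kept = relevant_attributes(generalized, ["city", "student"], "buys", threshold=0.1)
tree = build_tree(generalized, kept, "buys")
print(kept)
print(tree)
```

In this toy data, generalizing the city attribute to regions collapses four distinct values into two, so the tree is induced over fewer, coarser branches; the relevance step then discards the student attribute, which carries no information about the class here.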