Mining Classification Rules from Datasets with Large Number of Many-Valued Attributes

Authors:
Giovanni Giuffrida; Wesley W. Chu; Dominique M. Hanssens
Venue:
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2000

Citing 14

Probabilistic reasoning in intelligent systems: networks of plausible inference
C4.5: programs for machine learning
Fast discovery of association rules. Advances in knowledge discovery and data mining
Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery
The CN2 Induction Algorithm. Machine Learning
SLIQ: A Fast Scalable Classifier for Data Mining. EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Clustering Association Rules. ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering
SPRINT: A Scalable Parallel Classifier for Data Mining. VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
A Scalable Bottom-Up Data Mining Algorithm for Relational Databases. SSDBM '98 Proceedings of the 10th International Conference on Scientific and Statistical Database Management
Constraint-Based Rule Mining in Large, Dense Databases. ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Scalable Mining for Classification Rules in Relational Databases. IDEAS '98 Proceedings of the 1998 International Symposium on Database Engineering & Applications
Turning Datamining into a Management Science Tool: New Algorithms and Empirical Results. Management Science
Concept learning and the problem of small disjuncts. IJCAI'89 Proceedings of the 11th international joint conference on Artificial intelligence - Volume 1
Learning trees and rules with set-valued features. AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Cited 5

Mining Sequence Patterns from Wind Tunnel Experimental Data for Flight Control. PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Is this brand ephemeral? A multivariate tree-based decision analysis of new product sustainability. Decision Support Systems
Extending l-diversity to generalize sensitive data. Data & Knowledge Engineering
Classification based on association rules: A lattice-based approach. Expert Systems with Applications: An International Journal
CAR-Miner: An efficient algorithm for mining class-association rules. Expert Systems with Applications: An International Journal

Abstract

Decision tree induction algorithms scale well to large datasets because of their univariate, divide-and-conquer approach. However, they may fail to discover effective knowledge when the input dataset consists of a large number of uncorrelated many-valued attributes. In this paper we present an algorithm, Noah, that tackles this problem by applying a multivariate search. A multivariate search consumes much more computation time and memory, which may be prohibitive for large datasets. We remedy this problem by exploiting effective pruning strategies and efficient data structures. We applied our algorithm to a real marketing application of cross-selling. Experimental results revealed that the application database was too complex for C4.5, which failed to discover any useful knowledge. The application database was also too large for various well-known rule discovery algorithms, which were not able to complete their task. The pruning techniques used in Noah are general in nature and can be used in other mining systems.