New Techniques for Data Reduction in a Database System for Knowledge Discovery Applications

Authors:
Akhil Kumar
Affiliations:
College of Business, Campus Box 419, University of Colorado, Boulder, CO 80309-0419. E-mail: akhil.kumar@colorado.edu
Venue:
Journal of Intelligent Information Systems
Year:
1998

Abstract

Databases store large amounts of information about consumer transactions and other kinds of transactions. This information can be used to deduce rules about consumer behavior, and the rules can in turn be used to determine company policies, for instance with regard to production, marketing, and several other areas. Since databases typically store millions of records, and each record could have up to 100 or more attributes, as an initial step it is necessary to reduce the size of the database by eliminating attributes that do not influence the decision at all or do so very minimally. In this paper we present techniques that can be employed effectively for exact and approximate reduction in a database system. These techniques can be implemented efficiently in a database system using SQL (Structured Query Language) commands. We tested their performance on a real data set and validated them. The results showed that the classification performance actually improved with a reduced set of attributes as compared to the case when all the attributes were present. We also discuss how our techniques differ from statistical methods and other data reduction methods such as rough sets.
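
The abstract does not spell out the paper's SQL commands, so the following is only an illustrative sketch of what an exact reduction check could look like, not the author's technique. It assumes one simple criterion: an attribute is a candidate for removal if, after dropping it, no combination of the remaining attribute values maps to more than one decision value. The table `purchases`, its columns, the toy rows, and the helper names `inconsistent_groups` and `individually_removable` are all hypothetical.

```python
import sqlite3


def inconsistent_groups(conn, table, condition_attrs, decision_attr):
    """Count value combinations of condition_attrs that map to more than
    one distinct decision value. Identifiers are interpolated directly,
    so they must come from trusted code, never from user input."""
    cols = ", ".join(condition_attrs)
    query = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {cols} FROM {table} "
        f"GROUP BY {cols} "
        f"HAVING COUNT(DISTINCT {decision_attr}) > 1"
        f") AS g"
    )
    return conn.execute(query).fetchone()[0]


def individually_removable(conn, table, condition_attrs, decision_attr):
    """Return the attributes that can each be dropped on their own while the
    remaining attributes still determine the decision without conflicts.
    Assumed criterion for illustration only."""
    removable = []
    for attr in condition_attrs:
        rest = [a for a in condition_attrs if a != attr]
        if rest and inconsistent_groups(conn, table, rest, decision_attr) == 0:
            removable.append(attr)
    return removable


# Hypothetical toy data: the decision "bought" depends only on "age_group".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (age_group TEXT, region TEXT, coupon TEXT, bought TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        ("young", "east", "yes", "no"),
        ("young", "west", "no", "no"),
        ("old", "east", "yes", "yes"),
        ("old", "west", "no", "yes"),
        ("old", "east", "no", "yes"),
        ("young", "east", "no", "no"),
    ],
)

attrs = ["age_group", "region", "coupon"]
removable = individually_removable(conn, "purchases", attrs, "bought")
print(removable)
```

Each attribute is tested in isolation here, so two attributes that are individually removable are not guaranteed to be removable together; a fuller procedure would re-run the check after every removal. An approximate variant might tolerate a small fraction of conflicting records rather than requiring zero, but that threshold would be a further assumption.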