Identifying Fewer Key Factors by Attribute Selection Methodologies to Understand the Hospital Admission Prediction Pattern with Ant Miner and C4.5

  • Authors: Kyoko Fukuda

  • Affiliations: Geo Health Lab, Department of Geography, University of Canterbury, Christchurch, New Zealand 4800

  • Venue: KES '09 Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems: Part II
  • Year: 2009

Abstract

Attribute Selection (AS) is generally applied as a data pre-processing step to reduce the number of attributes in a dataset. This study uses six different data mining AS methods to identify a few key driving climate and air pollution attributes from a small attribute set (16 attributes), in order to increase knowledge about the underlying structure of acute respiratory hospital admission counts. Understanding the key factors in environmental science data helps in constructing a cost-effective data collection and management process, by focusing on collecting and investigating the more representative and important variables. The performance of each selected attribute set was tested with the Ant-Miner and C4.5 classifiers to examine its ability to predict the admission count. Removal of attributes was successful for all AS methods, especially TNSU (a newly developed AS method, Tree Node Selection for unpruned), which performed best at removing attributes and gave some improvement in classification accuracy for Ant-Miner and C4.5. However, the overall improvements in prediction accuracy are small, suggesting that AS selects attribute sets that are sufficient to maintain the accuracy of Ant-Miner and C4.5.
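
To make the idea of attribute selection as a pre-processing step more concrete, the following is a minimal sketch of one common filter-style approach: ranking attributes by information gain and keeping the top few. It is not the paper's TNSU method, nor necessarily one of the six AS methods the study used; information gain is chosen here only because it is closely related to the gain ratio criterion that C4.5 uses when building trees. The attribute names (`temperature`, `pm10`, `wind`), the toy records, and the helper `select_attributes` are all hypothetical and exist purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting on one attribute."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_attributes(rows, labels, k):
    """Hypothetical helper: keep the k attributes with the highest gain."""
    attrs = list(rows[0].keys())
    ranked = sorted(attrs,
                    key=lambda a: information_gain(rows, labels, a),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy records (not data from the study): discretised climate
# and air pollution attributes, with an admission-count class per record.
rows = [
    {"temperature": "low", "pm10": "high", "wind": "calm"},
    {"temperature": "low", "pm10": "high", "wind": "windy"},
    {"temperature": "low", "pm10": "low", "wind": "calm"},
    {"temperature": "high", "pm10": "high", "wind": "windy"},
    {"temperature": "high", "pm10": "low", "wind": "calm"},
    {"temperature": "high", "pm10": "low", "wind": "windy"},
]
labels = ["high", "high", "high", "low", "low", "low"]

selected = select_attributes(rows, labels, k=1)
print(selected)
```

In this toy data the admission class is fully determined by `temperature`, so that attribute ranks first. In a workflow like the one the abstract describes, the reduced attribute set would then be passed to a classifier and its accuracy compared against the full attribute set.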