An improved naive Bayesian classifier technique coupled with a novel input solution method [rainfall prediction]

  • Authors:
  • J. N.K. Liu;B. N.L. Li;T. S. Dillon

  • Affiliations:
  • Dept. of Comput., Hong Kong Polytech. Univ., Hung Hom;-;-

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data mining is the study of how to determine underlying patterns in the data to help make optimal decisions on computers when the database involved is voluminous, hard to characterize accurately and constantly changing. It deploys techniques based on machine learning alongside more conventional methods. These techniques can generate decision or prediction models based on actual historical data. Therefore, they represent true evidence-based decision support. Rainfall prediction is a good problem to solve by data mining techniques. This paper proposes an improved naive Bayes classifier (INCB) technique and explores the use of genetic algorithms (GAs) for the selection of a subset of input features in classification problems. It then carries out a comparison with several other techniques. It compares the following algorithms on real meteorological data in Hong Kong: (1) genetic algorithms with average classification or general classification (GA-AC and GA-C), (2) C4.5 with pruning, and (3) INBC with relative frequency or initial probability density (INBC-RF and INBC-IPD). Two simple schemes are proposed to construct a suitable data set for improving their performance. Scheme I uses all the basic input parameters for rainfall prediction. Scheme II uses the optimal subset of input variables which are selected by a GA. The results show that, among the methods we compared, INBC achieved about a 90% accuracy rate on the rain/no-rain classification problems. This method also attained reasonable performance on rainfall prediction with three-level depth and five-level depth, which are around 65%-70%