Rough Sets: Theoretical Aspects of Reasoning about Data
Rough Sets: Theoretical Aspects of Reasoning about Data
Genetic Algorithms in Search, Optimization and Machine Learning
Genetic Algorithms in Search, Optimization and Machine Learning
Rough sets perspective on data and knowledge
Handbook of data mining and knowledge discovery
Applying rough set theory to multi stage medical diagnosing
Fundamenta Informaticae
Neuro-fuzzy modeling and fuzzy rule extraction applied to conflict management
ICONIP'06 Proceedings of the 13th international conference on Neural information processing - Volume Part III
Extending particle swarm optimisation via genetic programming
EuroGP'05 Proceedings of the 8th European conference on Genetic Programming
Application of Simulated Annealing to the Biclustering of Gene Expression Data
IEEE Transactions on Information Technology in Biomedicine
Hi-index | 0.00 |
Rough set theory (RST) is concerned with the formal approximation of crisp sets and is a mathematical tool which deals with vagueness and uncertainty. This paper presents an approach to optimize rough set partition sizes using various optimization techniques. The forecasting accuracy is measured by using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The four optimization techniques used are genetic algorithm, particle swarm optimization, hill climbing and simulated annealing. This proposed method is tested on two data sets, namely, the human immunodeficiency virus (HIV) data set and the militarized interstate dispute (MID) data set. The results obtained from this granulization method are compared to two previous static granulization methods, namely, equal-width-bin and equal-frequency-bin partitioning. The results conclude that all of the proposed optimized methods produce higher forecasting accuracies than that of the two static methods. In the case of the HIV data set, the hill climbing approach produced the highest accuracy; an accuracy of 69.02% is achieved in a time of 210.4 hours. For the MID data, the genetic algorithm approach produced the highest accuracy. The accuracy achieved is 95.82% in a time of 7 hours. The rules generated from the rough set are linguistic and easy-to-interpret, but this does come at the expense of the accuracy lost in the discretization process where the granularity of the variables is decreased.