A genetic algorithm-based clustering approach for database partitioning

Authors:
Chun-Hung Cheng;Wing-Kin Lee;Kam-Fai Wong
Affiliations:
Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Shatin
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
Year:
2002

Citing 0
Cited 7

A high-performance computing method for data allocation in distributed database systems
The Journal of Supercomputing

An examination of cluster identification-based algorithms for vertical partitions
International Journal of Business Information Systems

A multi-inner-world genetic algorithm using multiple heuristics to optimize delivery schedule
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics

Cost-based fragmentation for distributed complex value databases
ER'07 Proceedings of the 26th international conference on Conceptual modeling

An overall-regional competitive self-organizing map neural network for the Euclidean traveling salesman problem
Neurocomputing

Particle swarm optimisation for data warehouse logical design
International Journal of Bio-Inspired Computation

Performance optimality enhancement algorithm in DDBS (POEA)
Computers in Human Behavior

Abstract

In a typical distributed or parallel database system, a request usually accesses only a subset of the database. It is therefore natural to group commonly accessed data together and to place it on nearby, preferably the same, machines or sites. For this reason, data partitioning and data allocation are performance-critical issues in distributed database application design. This paper addresses data partitioning, which requires the use of clustering. Although many clustering algorithms have been proposed, their performance has not been extensively studied, and the special problem structure in clustering is rarely exploited. We explore the use of a genetic search-based clustering algorithm for data partitioning to achieve high database retrieval performance. By formulating the underlying problem as a traveling salesman problem (TSP), we can take advantage of this structure. We also propose three new operators for genetic algorithms (GAs); experimental results indicate that they outperform other operators in solving the TSP. The proposed GA is then applied to the data-partitioning problem, and our computational study shows that it performs well for this application.
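
The idea of treating clustering as a TSP and searching it with a GA can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not describe the three proposed operators, so standard order crossover (OX) and swap mutation stand in for them. The sketch also assumes one common way of casting the problem, in which attributes are ordered so that high-affinity attributes sit next to each other (distance = maximum affinity minus affinity) and the ordering is then cut into groups. All names (path_cost, order_crossover, swap_mutation, ga_order, cut_order), the toy affinity matrix, and the parameter values are hypothetical.

```python
import random

def path_cost(order, dist):
    """Length of the open path that visits the items in the given order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def order_crossover(p1, p2, rng):
    """Standard OX: keep a slice of p1, fill the gaps in p2's relative order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [g for g in p2 if g not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, exchange two randomly chosen positions."""
    order = order[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]
    return order

def ga_order(dist, pop_size=40, generations=200, seed=0):
    """Elitist GA over permutations; returns (best order, best cost per generation)."""
    rng = random.Random(seed)
    n = len(dist)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda o: path_cost(o, dist))
        history.append(path_cost(pop[0], dist))
        elite = pop[: pop_size // 2]  # truncation selection; elite survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        pop = elite + children
    return min(pop, key=lambda o: path_cost(o, dist)), history

def cut_order(order, dist, threshold):
    """Split an ordering into groups wherever adjacent items are far apart."""
    groups = [[order[0]]]
    for a, b in zip(order, order[1:]):
        if dist[a][b] > threshold:
            groups.append([b])
        else:
            groups[-1].append(b)
    return groups

# Toy affinity matrix (assumed data): attributes 0-2 and 3-5 are accessed together.
affinity = [[9 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]
top = max(max(row) for row in affinity)
dist = [[top - a for a in row] for row in affinity]

best, history = ga_order(dist)
print(best, path_cost(best, dist), cut_order(best, dist, threshold=4))
```

Because the better half of each generation is carried over unchanged, the best cost never gets worse from one generation to the next. The cut threshold is an arbitrary choice for this toy matrix; a real partitioning step would need a criterion tied to the access workload.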