The use of cluster analysis in physical data base design

  • Authors:
  • Jeffrey A. Hoffer; Dennis G. Severance

  • Affiliations:
  • Case Western Reserve University, Cleveland, Ohio; University of Minnesota, Minneapolis, Minnesota

  • Venue:
  • VLDB '75 Proceedings of the 1st International Conference on Very Large Data Bases
  • Year:
  • 1975

Abstract

The physical structure and relative placement of information elements within a data base are critical for the efficient design of a computerized information system that is shared by a community of users. Traditionally, the selection among alternative structural designs has been handled largely via heuristics. Recent research has shown that a number of significant design problems can be stated mathematically as nonlinear, integer, zero-one programming problems. In concept, therefore, mathematical programming algorithms can be used to determine "optimal" data base designs. In practice, one finds that realistic problems of even modest size are computationally infeasible. This paper presents a means for overcoming this difficulty in the design of data base records. A metric with which to measure the similarity of usage among data items is developed and used by a clustering algorithm to reduce the space of alternative designs to a point where a solution is economically feasible.
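
The idea in the last sentence of the abstract can be illustrated with a minimal sketch. The abstract does not specify the metric or the clustering algorithm, so everything here is an assumption made for illustration: the Jaccard coefficient stands in for the usage-similarity metric, single-linkage agglomerative clustering stands in for the clustering step, and the names `jaccard`, `cluster_items`, `usage`, and `threshold`, as well as the sample workload, are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed usage-similarity metric: shared users / users of either item."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_items(usage, threshold):
    """Single-linkage agglomerative clustering (an illustrative stand-in).

    Repeatedly merges the two most similar clusters and stops when no
    pair of clusters reaches `threshold`.
    """
    clusters = [{item} for item in sorted(usage)]

    def link(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(usage[x], usage[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        if link(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] |= clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Hypothetical workload: the set of transactions that read each data item.
usage = {
    "name": {"t1", "t2", "t3"},
    "address": {"t1", "t2"},
    "salary": {"t4", "t5"},
    "tax_code": {"t4", "t5", "t6"},
}

groups = cluster_items(usage, threshold=0.5)
print(groups)
```

In this toy workload the four items collapse into two groups. If a later optimization step treats each group as a single unit, it has 2 ways to partition two groups into records to consider instead of the 15 ways to partition four individual items, which is the kind of reduction in the design space the abstract describes.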