Vector-based approach to analysis of file space properties

  • Authors:
  • I. A. R. Moghrabi; R. Makholian

  • Affiliations:
  • Natural Science Division, Lebanese American University, P.O. Box 13-5053, Beirut, LEBANON; Natural Science Division, Lebanese American University, P.O. Box 13-5053, Beirut, LEBANON

  • Venue:
  • Progress in computer research
  • Year:
  • 2001

Abstract

This work introduces a new approach to vector-model clustering: a hybrid algorithm that clusters records according to a prescribed threshold value while taking into account the query patterns observed in a given database. The Hamming distance of a file is used as a 'cheap' measure of space density. The algorithm aims to minimize the response time of a retrieval system by partly maximizing the space density of the file and by keeping popular tuples in physical proximity in the file space. Simulation experiments showed a substantial reduction in response time after a file was restructured. Factors that affect clustering and response time, such as block size, threshold value, and the percentage of records satisfying a given set of queries, are studied using statistical analysis.
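
To make the abstract's two main ingredients concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes records are equal-length binary vectors, takes the file-level Hamming distance to be the sum of pairwise distances between records, and uses a simple single-pass "leader" scheme for threshold clustering. The function names (hamming, file_hamming_distance, threshold_cluster) and the sample records are hypothetical, and the query-pattern component of the hybrid algorithm is not modelled.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def file_hamming_distance(records):
    """Assumed file-level measure: sum of all pairwise Hamming distances.

    Under this assumption, a smaller total means the records are more
    alike, i.e. the file space is denser.
    """
    return sum(hamming(a, b) for a, b in combinations(records, 2))


def threshold_cluster(records, threshold):
    """Single-pass leader clustering (an assumed stand-in technique).

    A record joins the first cluster whose leader (first member) lies
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for rec in records:
        for cluster in clusters:
            if hamming(rec, cluster[0]) <= threshold:
                cluster.append(rec)
                break
        else:
            # No existing leader was close enough.
            clusters.append([rec])
    return clusters


# Hypothetical sample file of four binary record vectors.
records = [(1, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, 0), (0, 1, 0, 1)]
clusters = threshold_cluster(records, threshold=1)
total_distance = file_hamming_distance(records)
```

With these sample records and a threshold of 1, the first two records fall into one cluster and the last two into another. In a fuller treatment, records that co-occur in frequent queries would also influence which clusters are placed in the same physical blocks.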