Fuzzy Statistics Estimation in Supporting Multidatabase Query Optimization

  • Authors:
  • Chih-Ping Wei; Olivia R. Liu Sheng; Paul Jen-Hwa Hu

  • Affiliations:
  • Department of Information Management, College of Management, National Sun Yat-Sen University, Kaohsiung, Taiwan, R.O.C. (cwei@mis.nsysu.edu.tw); Department of Management Information Systems, College of Business and Public Administration, University of Arizona, Tucson, Arizona 85721, U.S.A. (olivia@bpa.arizona.edu); Department of Accounting and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, Utah 84112, U.S.A. (actph@business.utah.edu)

  • Venue:
  • Electronic Commerce Research
  • Year:
  • 2002

Abstract

Advances in networking and database technology have made global information sharing a reality. Multidatabase systems (MDBSs) represent a promising approach to achieving interoperability among multiple pre-existing databases that are highly autonomous and possibly heterogeneous. The performance of an MDBS depends heavily on the effectiveness of multidatabase query optimization (MQO). However, the unavailability of and uncertainty in the statistics essential to query optimization have made MQO significantly more challenging than distributed query optimization. This research developed a fuzzy-statistics-based MQO approach to address the statistics estimation and uncertainty problems in an MDBS environment. We analyzed the statistics needed in an MDBS environment and classified them into three categories: point-based, distribution-function-based, and dependency-based. Fuzzy numbers were adopted to represent point-based statistics, and a fuzzy polynomial regression method was developed for estimating distribution-function-based statistics (i.e., attribute or join selectivity) from a set of subquery results. For dependency-based statistics, a fuzzy regression method was employed for estimating logical-parameter-based local cost functions. Methods for ranking fuzzy numbers, which are fundamental to fuzzy-statistics-based MQO, were also discussed. The proposed fuzzy statistics estimation methods were illustrated with examples to demonstrate their applicability in supporting MQO.
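
To make the idea of representing a point-based statistic as a fuzzy number and ranking fuzzy estimates more concrete, the following is a minimal sketch only. The abstract does not say which fuzzy-number shape or ranking method the paper uses, so the triangular representation, the centroid-based ranking, and all names (`TriangularFuzzyNumber`, `cheaper_plan`, the plan labels and values) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Hypothetical triangular fuzzy number (low, mode, high).

    Assumption: an uncertain statistic such as a relation's cardinality or
    a plan's cost is described by its smallest plausible value, its most
    plausible value, and its largest plausible value.
    """
    low: float
    mode: float
    high: float

    def __post_init__(self) -> None:
        if not (self.low <= self.mode <= self.high):
            raise ValueError("expected low <= mode <= high")

    def __add__(self, other: "TriangularFuzzyNumber") -> "TriangularFuzzyNumber":
        # Standard addition of triangular fuzzy numbers: add component-wise.
        return TriangularFuzzyNumber(
            self.low + other.low,
            self.mode + other.mode,
            self.high + other.high,
        )

    def centroid(self) -> float:
        # Centroid of a triangular membership function.
        return (self.low + self.mode + self.high) / 3.0


def cheaper_plan(plans: dict[str, TriangularFuzzyNumber]) -> str:
    """Return the plan name with the smallest centroid.

    Centroid ranking is one common way to order fuzzy numbers; it is used
    here as a stand-in, not as the ranking method from the paper.
    """
    return min(plans, key=lambda name: plans[name].centroid())


if __name__ == "__main__":
    # Made-up fuzzy cost estimates for two candidate plans.
    plans = {
        "plan_a": TriangularFuzzyNumber(10.0, 20.0, 60.0),
        "plan_b": TriangularFuzzyNumber(25.0, 28.0, 31.0),
    }
    print(cheaper_plan(plans))
```

In this made-up example `plan_a` has the lower most-plausible cost, but its wide upper spread pushes its centroid above that of `plan_b`, which illustrates why the choice of ranking method matters when statistics are uncertain.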