DCUBE: CUBE on dirty databases

  • Authors:
  • Guohua Jiang;Hongzhi Wang;Shouxu Jiang;Jianzhong Li;Hong Gao

  • Affiliations:
  • Institute of Computer Science and Technology, Harbin Institute of Technology, Harbin, China (all authors)

  • Venue:
  • WAIM'10: Proceedings of the 11th International Conference on Web-Age Information Management
  • Year:
  • 2010

Abstract

In real-world databases, dirty data such as inconsistent and duplicate records impair the effectiveness of database applications and pose new challenges for processing OLAP efficiently. CUBE is an important OLAP operator. This paper proposes a CUBE operation based on overlapping clustering, together with an effective and efficient method for storing and computing CUBE on databases with dirty data. Building on CUBE, the paper proposes efficient algorithms for answering aggregation queries, as well as methods for processing the other major OLAP operators on databases with dirty data. Experimental results show the efficiency of the proposed algorithms.
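
For readers unfamiliar with the operator, the following is a minimal sketch of the classical CUBE on clean data: it computes an aggregate for every subset of the grouping dimensions. It does not reproduce the paper's overlapping-clustering method for dirty data. The function name `cube`, the sample table, and the column names are hypothetical, and SUM is assumed as the aggregate.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Classical CUBE with SUM: aggregate `measure` over every subset of `dims`.

    Returns a dict keyed by (grouped_dims, key_values). This is the textbook
    operator on clean data, not the DCUBE method described in the abstract.
    """
    result = defaultdict(float)
    # Enumerate every subset of the dimensions, from () up to all of them.
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            for row in rows:
                key = tuple(row[d] for d in group)
                result[(group, key)] += row[measure]
    return dict(result)


# Hypothetical sales table used only for illustration.
rows = [
    {"city": "Harbin", "year": 2009, "sales": 10.0},
    {"city": "Harbin", "year": 2010, "sales": 5.0},
    {"city": "Beijing", "year": 2010, "sales": 7.0},
]
totals = cube(rows, ("city", "year"), "sales")
```

Here `totals[((), ())]` holds the grand total, `totals[(("city",), ("Harbin",))]` the total per city, and so on for each grouping. With duplicate or inconsistent records, a single real-world entity can appear under several keys, which is the situation the paper addresses.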