A new mining approach for uncertain databases using CUFP trees

  • Authors:
  • Chun-Wei Lin;Tzung-Pei Hong

  • Affiliations:
  • Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan, ROC;Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan, ROC and Department of Computer Science and Engineering, National Sun Yat-sen Un ...

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Abstract

Many algorithms have been proposed for mining frequent itemsets from transactional databases in which the presence or absence of each item in a transaction is known with certainty. In some applications, however, items are uncertain: each item in a transaction of an uncertain dataset carries an existential probability ranging from 0 to 1. Mining such uncertain datasets differs substantially from mining certain ones. The UF-tree algorithm was proposed to construct a UF-tree structure from an uncertain dataset and to mine frequent itemsets from that tree. During UF-tree construction, however, occurrences of an item are merged only when they have the same existential probability in their transactions, which leaves many redundant nodes in the tree. In this paper, a new tree structure called the compressed uncertain frequent-pattern tree (CUFP tree) is designed to efficiently keep the related information for the mining process. In the CUFP tree, occurrences of the same item are merged within a branch of the tree even when their existential probabilities in the transactions differ. A mining algorithm called CUFP-mine is then proposed, based on this tree structure, to find uncertain frequent patterns. Experimental results show that the proposed approach outperforms the UF-tree algorithm in both execution time and the number of tree nodes.
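
The difference between the two merging rules can be illustrated with a small sketch. This is not the paper's CUFP-tree or CUFP-mine algorithm: the toy dataset, the function names `expected_support` and `count_nodes`, and the alphabetical item order are all assumptions made for illustration. The sketch uses the usual definition of expected support (the sum, over transactions, of the product of the item probabilities, with items assumed independent) and counts prefix-tree nodes when nodes are shared only for equal (item, probability) pairs versus when they are shared by item alone. How the CUFP tree keeps the information needed for mining after merging is not described in the abstract and is not modeled here.

```python
from math import prod

# Hypothetical toy uncertain dataset (not from the paper):
# each transaction maps an item to its existential probability.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "b": 0.7},
    {"a": 0.9, "c": 0.4},
]

def expected_support(itemset, db):
    """Expected support: sum over transactions containing the itemset of the
    product of the item probabilities (items assumed independent)."""
    return sum(
        prod(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )

def count_nodes(db, key):
    """Insert every transaction into a prefix tree built from nested dicts
    and return the number of nodes created. `key` decides which item
    occurrences are allowed to share a node."""
    root = {}
    nodes = 0
    for t in db:
        node = root
        # Alphabetical item order is an assumption made for this sketch.
        for item in sorted(t):
            k = key(item, t[item])
            if k not in node:
                node[k] = {}
                nodes += 1
            node = node[k]
    return nodes

# Share a node only when both item and probability match (UF-tree-like rule).
uf_style_nodes = count_nodes(transactions, lambda item, p: (item, p))
# Share a node whenever the item matches on the branch (CUFP-tree-like rule).
merged_style_nodes = count_nodes(transactions, lambda item, p: item)

print(expected_support({"a", "b"}, transactions))
print(uf_style_nodes, merged_style_nodes)
```

On this toy input the item-only rule creates fewer nodes than the (item, probability) rule, which is the kind of redundancy the abstract attributes to the UF-tree.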