Patterns from multiresolution 0-1 data

  • Authors:
  • Prem Raj Adhikari;Jaakko Hollmén

  • Affiliations:
  • Aalto University School of Science and Technology, Aalto;Aalto University School of Science and Technology, Aalto

  • Venue:
  • Proceedings of the ACM SIGKDD Workshop on Useful Patterns
  • Year:
  • 2010

Abstract

Biological systems are complex, and biological data are often available in different resolutions. Computational algorithms are often designed to work with only one specific resolution of data, so upsampling or downsampling is necessary before the data can be fed to the algorithm. Moreover, high-resolution data contain a significant amount of noise, which produces an explosion of redundant patterns such as maximal frequent itemsets, closed frequent itemsets, and non-derivable itemsets. This problem can be addressed by downsampling the data, provided that the information loss during sampling is insignificant. Furthermore, comparing the results of an algorithm on data in different resolutions can produce interesting results, which aids in determining a suitable resolution for the data. In addition, experiments in different resolutions can help determine the appropriate resolution for computational methods. In this paper, three methods of downsampling are proposed and implemented, experiments are performed on different resolutions, the suitability of the proposed methods is validated, and the results are compared. Mixture models are trained on the data and the results are analyzed; the proposed methods produce plausible results, showing that the significant patterns in the data are retained in the lower resolution. The proposed methods can be used extensively in the integration of databases.
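
To make the idea of downsampling 0-1 data concrete, the following is a minimal sketch in Python. The abstract does not name the three proposed methods, so the two rules shown here (majority vote and logical OR over blocks of adjacent values) are illustrative assumptions, not the authors' methods. The function names, the `factor` parameter, and the tie-breaking rule are likewise hypothetical.

```python
from typing import List


def downsample_majority(row: List[int], factor: int) -> List[int]:
    """Collapse each block of `factor` adjacent 0-1 values into one value.

    Assumed rule (not taken from the paper): the coarse value is 1 when at
    least half of the fine values in the block are 1, so ties resolve to 1.
    A shorter final block is voted on using its own length.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    coarse = []
    for start in range(0, len(row), factor):
        block = row[start:start + factor]
        coarse.append(1 if 2 * sum(block) >= len(block) else 0)
    return coarse


def downsample_or(row: List[int], factor: int) -> List[int]:
    """Alternative assumed rule: the coarse value is 1 if any fine value is 1."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [
        1 if any(row[start:start + factor]) else 0
        for start in range(0, len(row), factor)
    ]


if __name__ == "__main__":
    fine = [1, 1, 0, 0, 0, 1, 1, 0, 1]
    print(downsample_majority(fine, 3))
    print(downsample_or(fine, 3))
```

The two rules trade off differently: OR never drops a 1 that is present in the fine data, while majority vote suppresses isolated 1s, which may be preferable when the high-resolution data are noisy.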