Identifying appropriate methodologies and strategies for vertical mining with incomplete data

  • Authors:
  • Faris Alqadah;Zhen Hu;Lawrence J. Mazlack

  • Affiliations:
  • Applied Computational Intelligence Laboratory, University of Cincinnati;Applied Computational Intelligence Laboratory, University of Cincinnati;Applied Computational Intelligence Laboratory, University of Cincinnati

  • Venue:
  • WSEAS Transactions on Computers
  • Year:
  • 2008

Abstract

Many data mining methods depend on recognizing frequent patterns. Frequent patterns lead to the discovery of association rules, strong rules, sequential episodes, and multi-dimensional patterns, all of which can play a critical role in helping corporate and scientific institutions understand and analyze their data. Patterns should be discovered in a time- and space-efficient manner. Discovered patterns have authentic value when they accurately describe data trends and do not solely reflect noise or chance encounters. The key advantage of vertical data mining algorithms is that they can outperform their horizontal counterparts in both time and space efficiency. However, little work has addressed how incomplete data influences vertical data mining. Consequently, the quality and utility of the results of vertical mining algorithms remain ambiguous, since real data sets often contain incomplete data. This paper considers how to establish methodologies for dealing with incomplete data in vertical mining; additionally, it seeks to develop strategies for determining the maximal utilization that can be mined from a dataset based on how much and what data is missing.