High confidence fragment-based classification rule mining for imbalanced HIV data

  • Authors:
  • Bing Lv; Jianyong Wang; Lizhu Zhou

  • Affiliations:
  • Database Group, Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China (all authors)

  • Venue:
  • APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
  • Year:
  • 2008

Abstract

In this paper, we study the problem of mining high-confidence fragment-based classification rules from imbalanced HIV data whose class distribution is extremely skewed. We propose an efficient approach to mining frequent fragments in different classes of compounds; these fragments provide the best hints about the characteristics of each class and can be used to build associative classification rules. We adopt the pattern-growth paradigm and define an efficient fragment enumeration scheme. Moreover, we introduce an improved instance-centric rule-generation strategy to mine the high-confidence fragment-based classification rules, which are insightful and useful in differentiating one class from the others. Experiments show that our algorithm can discover more interesting rules than the previous method and can facilitate the detection of new compounds with the desired anti-HIV activity.
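
To make the idea of instance-centric, high-confidence rule selection more concrete, the following is a minimal Python sketch and not the authors' algorithm. It rests on several assumptions that are not in the abstract: each compound is reduced to a set of fragment identifiers (plain strings here, whereas real fragments are subgraphs found by pattern-growth enumeration), each rule has a single fragment as its antecedent, and the function name `mine_rules` and the parameter `min_support` are hypothetical. For every training compound, the sketch keeps the rule "fragment → class" with the highest confidence among the fragments that compound contains, so a compound from a rare class can still receive its own rule.

```python
from collections import Counter


def mine_rules(compounds, labels, min_support=1):
    """Return {instance_index: (fragment, label, confidence, support)}.

    compounds: list of sets of fragment identifiers (an assumed encoding).
    labels:    list of class labels, one per compound.
    """
    # Count how many compounds contain each fragment, overall and per class.
    total = Counter()
    per_class = Counter()
    for frags, label in zip(compounds, labels):
        for f in frags:
            total[f] += 1
            per_class[(f, label)] += 1

    best = {}
    for i, (frags, label) in enumerate(zip(compounds, labels)):
        candidates = []
        for f in frags:
            support = per_class[(f, label)]
            if support >= min_support:
                # Confidence of the rule "f -> label".
                confidence = support / total[f]
                candidates.append((confidence, support, f))
        if candidates:
            # Highest confidence wins; ties are broken by support, then name.
            confidence, support, f = max(candidates)
            best[i] = (f, label, confidence, support)
    return best


# Toy, made-up data: one "active" compound among three "inactive" ones.
compounds = [
    {"C-N", "N=N=N"},
    {"C-N", "C=O"},
    {"C-N", "C=O"},
    {"C=O", "C-Cl"},
]
labels = ["active", "inactive", "inactive", "inactive"]

rules = mine_rules(compounds, labels)
for index, rule in sorted(rules.items()):
    print(index, rule)
```

On this toy input, the single active compound is covered by the rule built from the fragment that occurs only in the active class, even though that fragment has a support of one. A global support threshold tuned for the majority class would typically discard such a rule, which is one plausible reason to select rules per instance on skewed data.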