Mining contrast inequalities in numeric dataset

  • Authors:
  • Lei Duan;Jie Zuo;Tianqing Zhang;Jing Peng;Jie Gong

  • Affiliations:
  • School of Computer Science, Sichuan University, Chengdu, China;School of Computer Science, Sichuan University, Chengdu, China;School of Computer Science, Sichuan University, Chengdu, China;Sci. & Tech. Department, Chengdu Municipal Public Security Bureau, Chengdu, China;School of Computer Science, Sichuan University, Chengdu, China

  • Venue:
  • WAIM'10 Proceedings of the 11th international conference on Web-age information management
  • Year:
  • 2010

Quantified Score

Hi-index 0.01

Abstract

Finding relational expressions that hold frequently in one class of data but not in another is an interesting task. In this paper, a relational expression of this kind is defined as a contrast inequality. Gene Expression Programming (GEP) is a powerful technique for discovering relations in data and expressing them in mathematical form, so it is a natural fit for this mining task. The main contributions of this paper are: (1) introducing the concept of contrast inequality mining, (2) designing a two-genome chromosome structure that guarantees each individual in GEP is a valid inequality, (3) proposing a new genetic mutation that improves the efficiency of evolving contrast inequalities, (4) presenting a GEP-based method for discovering contrast inequalities, and (5) giving an extensive performance study on real-world datasets. The experimental results show that the proposed methods are effective: contrast inequalities with high discriminative power are discovered from the real-world datasets. Directions for future work on contrast inequality mining are also discussed.
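To make the notion of a contrast inequality concrete, the following minimal Python sketch measures how often a candidate inequality holds in each of two classes. It is an illustration only, not the paper's method: the abstract gives no formal definition, so the helper names `support` and `is_contrast`, the thresholds `min_pos` and `max_neg`, and the toy data are all assumptions, and no GEP search is performed.

```python
from typing import Callable, Sequence

Row = Sequence[float]
Inequality = Callable[[Row], bool]


def support(inequality: Inequality, rows: Sequence[Row]) -> float:
    """Fraction of rows in which the inequality holds (0.0 for an empty class)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if inequality(row)) / len(rows)


def is_contrast(
    inequality: Inequality,
    positive: Sequence[Row],
    negative: Sequence[Row],
    min_pos: float = 0.7,  # assumed threshold, not taken from the paper
    max_neg: float = 0.3,  # assumed threshold, not taken from the paper
) -> bool:
    """True if the inequality is frequent in one class and rare in the other."""
    return (
        support(inequality, positive) >= min_pos
        and support(inequality, negative) <= max_neg
    )


# Hypothetical toy data: each row holds two numeric attributes (x0, x1).
positive = [(5.0, 1.0), (4.0, 2.0), (6.0, 1.5), (3.0, 1.0)]
negative = [(1.0, 2.0), (2.0, 3.0), (1.5, 1.0), (4.0, 1.0)]


def candidate(row: Row) -> bool:
    """Candidate inequality x0 > 2 * x1."""
    return row[0] > 2 * row[1]


if __name__ == "__main__":
    print(support(candidate, positive))                # 0.75
    print(support(candidate, negative))                # 0.25
    print(is_contrast(candidate, positive, negative))  # True
```

In this sketch the candidate holds for three of the four positive rows and one of the four negative rows, so it passes the assumed thresholds. A search procedure such as the GEP-based method described in the abstract would generate and score many such candidates rather than testing a single hand-written one.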