COSINE: a vertical group difference approach to contrast set mining

Authors:
Mondelle Simeon;Robert Hilderman
Affiliations:
Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada;Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada
Venue:
Canadian AI'11 Proceedings of the 24th Canadian conference on Advances in artificial intelligence
Year:
2011

Abstract

Contrast sets have been shown to be a useful mechanism for describing differences between groups. A contrast set is a conjunction of attribute-value pairs whose distribution differs significantly across groups. These groups are defined by a selected property that distinguishes one from the other (e.g., customers who default on their mortgage versus those who do not). In this paper, we propose a new search algorithm that uses a vertical approach for mining maximal contrast sets on categorical and quantitative data. We utilize a novel yet simple discretization technique, akin to simple binning, for continuous-valued attributes. Our experiments on real datasets demonstrate that our approach is more efficient than two previously proposed algorithms and more effective in filtering interesting contrast sets.
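
To make the definition concrete, the following minimal sketch shows how a vertical (row-id set) layout can be used to measure how differently a conjunction of attribute-value pairs is distributed across two groups. It is an illustration only and not the algorithm proposed in the paper: the toy data, the attribute names, and the helper functions `build_vertical` and `group_supports` are all hypothetical. A real miner would also apply a statistical significance test and a search over candidate conjunctions, both of which are omitted here.

```python
from collections import defaultdict

# Hypothetical toy data: one (group, record) tuple per transaction.
ROWS = [
    ("default", {"income": "low", "owner": "no"}),
    ("default", {"income": "low", "owner": "no"}),
    ("default", {"income": "low", "owner": "yes"}),
    ("default", {"income": "high", "owner": "no"}),
    ("repaid", {"income": "high", "owner": "yes"}),
    ("repaid", {"income": "high", "owner": "yes"}),
    ("repaid", {"income": "low", "owner": "yes"}),
    ("repaid", {"income": "high", "owner": "no"}),
]


def build_vertical(rows):
    """Map each attribute-value pair and each group to its set of row ids."""
    item_tids = defaultdict(set)
    group_tids = defaultdict(set)
    for tid, (group, record) in enumerate(rows):
        group_tids[group].add(tid)
        for pair in record.items():
            item_tids[pair].add(tid)
    return item_tids, group_tids


def group_supports(pairs, item_tids, group_tids):
    """Fraction of each group's rows that match every pair in `pairs`."""
    # In a vertical layout, a conjunction is just a set intersection.
    tids = set.intersection(*(item_tids[p] for p in pairs))
    return {g: len(tids & members) / len(members)
            for g, members in group_tids.items()}


item_tids, group_tids = build_vertical(ROWS)
candidate = [("income", "low"), ("owner", "no")]
supports = group_supports(candidate, item_tids, group_tids)
difference = abs(supports["default"] - supports["repaid"])
print(supports, difference)
```

In this toy example the conjunction matches half of the "default" rows and none of the "repaid" rows, so it would be a candidate contrast set under a support-difference threshold below 0.5 (the threshold itself is an assumption, not a value from the paper).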