Constraining and summarizing association rules in medical data

Authors:
Carlos Ordonez; Norberto Ezquerra; Cesar A. Santana
Affiliations:
Teradata, NCR, San Diego, CA; Georgia Institute of Technology, Atlanta, GA; Emory University Hospital, GA
Venue:
Knowledge and Information Systems
Year:
2006

Abstract

Association rules are a data mining technique for discovering frequent patterns in a data set. In this work, association rules are applied in the medical domain, where data sets are generally high dimensional and small. The chief disadvantage of mining association rules in a high-dimensional data set is the huge number of patterns discovered, most of which are irrelevant or redundant. Several constraints are proposed to filter rules, with the aim of discovering only significant association rules and accelerating the search. A greedy algorithm is introduced to compute rule covers that summarize rules having the same consequent. The significance of association rules is evaluated using three metrics: support, confidence and lift. Experiments focus on discovering association rules in a real data set to predict the absence or existence of heart disease. Constraints are shown to significantly reduce the number of discovered rules and to improve running time. Rule covers summarize a large number of rules by producing a succinct set of rules with high-quality metrics.
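
The three metrics named in the abstract can be illustrated with a minimal sketch that uses their standard textbook definitions for a rule X => Y: support is the fraction of records containing both X and Y, confidence is that support divided by the support of X, and lift is confidence divided by the support of Y. The function names, item labels and toy records below are hypothetical and chosen only for illustration; they are not taken from the paper or its heart disease data set.

```python
from typing import FrozenSet, List, Tuple

Record = FrozenSet[str]


def support(itemset: FrozenSet[str], records: List[Record]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if itemset <= r)
    return hits / len(records)


def rule_metrics(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    records: List[Record],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    s_xy = support(antecedent | consequent, records)
    s_x = support(antecedent, records)
    s_y = support(consequent, records)
    # Guard against division by zero when an itemset never occurs.
    confidence = s_xy / s_x if s_x > 0 else 0.0
    lift = confidence / s_y if s_y > 0 else 0.0
    return s_xy, confidence, lift


# Hypothetical toy records (assumption: invented labels, not the paper's data).
records = [
    frozenset({"age>=60", "chol_high", "disease"}),
    frozenset({"age>=60", "disease"}),
    frozenset({"chol_high", "healthy"}),
    frozenset({"age>=60", "chol_high", "disease"}),
]

sup, conf, lift = rule_metrics(
    frozenset({"age>=60"}), frozenset({"disease"}), records
)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if they were independent, which is one common reason to use lift alongside support and confidence when filtering rules.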