This paper presents an attribute clustering method that groups genes based on their interdependence in order to mine meaningful patterns from gene expression data. It can be used for gene grouping, selection, and classification. Partitioning a relational table into attribute subgroups allows a small number of attributes within or across the groups to be selected for analysis. Clustering attributes thus reduces the search dimension of a data mining algorithm. This reduction is especially important for gene expression data, which typically consist of a huge number of genes (attributes) and a small number of gene expression profiles (tuples). Most data mining algorithms are developed and optimized to scale with the number of tuples rather than the number of attributes. The situation worsens when the number of attributes overwhelms the number of tuples, in which case the likelihood of reporting patterns that are actually irrelevant, arising merely by chance, becomes rather high. For these reasons, gene grouping and selection are important preprocessing steps that many data mining algorithms need in order to be effective on gene expression data. This paper defines the problem of attribute clustering and introduces a methodology for solving it. The proposed method groups interdependent attributes into clusters by optimizing a criterion function derived from an information measure that reflects the interdependence between attributes. Applying the algorithm to gene expression data yields meaningful clusters of genes. Grouping genes by attribute interdependence within each group helps capture different aspects of gene association patterns in each group. Significant genes selected from each group then contain useful information for gene expression classification and identification.
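The idea of grouping attributes by an information-based interdependence measure can be illustrated with a minimal sketch. The abstract does not specify the measure or the optimization procedure, so everything here is an assumption: the measure is taken to be mutual information normalized by joint entropy, I(X;Y) / H(X,Y), the grouping uses a simple k-modes-style loop, and the names `interdependence` and `cluster_attributes` are hypothetical. Expression values are assumed to have been discretized already.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (bits) of a distribution given as raw counts."""
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def interdependence(x, y):
    """Assumed measure: I(X;Y) / H(X,Y), which lies in [0, 1]."""
    n = len(x)
    h_x = entropy(Counter(x).values(), n)
    h_y = entropy(Counter(y).values(), n)
    h_xy = entropy(Counter(zip(x, y)).values(), n)
    if h_xy == 0:  # both attributes are constant: no information to share
        return 0.0
    return (h_x + h_y - h_xy) / h_xy


def cluster_attributes(columns, k, max_iter=20):
    """Group attributes around k 'mode' attributes (illustrative k-modes-style loop).

    columns: dict mapping attribute name -> list of discrete values.
    Returns a dict mapping each mode name -> list of member attribute names.
    """
    names = list(columns)
    r = {(a, b): interdependence(columns[a], columns[b])
         for a in names for b in names}
    modes = names[:k]  # naive initialization: the first k attributes
    for _ in range(max_iter):
        clusters = {m: [] for m in modes}
        for a in names:
            # A mode stays in its own cluster; others join the most interdependent mode.
            best = a if a in clusters else max(modes, key=lambda m: r[(a, m)])
            clusters[best].append(a)
        # New mode of a cluster: the member with the highest summed
        # interdependence with the other members.
        new_modes = [
            max(members, key=lambda a: sum(r[(a, b)] for b in members if b != a))
            for members in clusters.values()
        ]
        if set(new_modes) == set(modes):
            break
        modes = new_modes
    return clusters


# Toy data: g1/g2 carry one signal, g3/g4 carry another, independent one.
toy = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],
    "g4": [0, 0, 1, 1, 0, 0, 1, 1],
}
groups = sorted(sorted(members) for members in cluster_attributes(toy, k=2).values())
print(groups)
```

On the toy data the two pairs of identical attributes end up in separate clusters. The loop searches over attributes rather than tuples, which is the point made above: the cost grows with the number of attribute pairs, so precomputing the pairwise table `r` once is the dominant step.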
To evaluate the proposed approach, we applied it to two well-known gene expression data sets and compared the results with those obtained by other methods. The experiments show that the proposed method finds meaningful clusters of genes. Selecting a subset of genes that have high multiple-interdependence with the other genes in their clusters yields significant classification information. Thus, a small pool of selected genes can be used to build classifiers with a very high classification rate. From the pool, gene expressions of different categories can be identified.
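The selection step, picking genes with high multiple-interdependence inside each cluster, might look like the following sketch. It rests on assumptions not stated in the abstract: "multiple-interdependence" is taken to mean the sum of a gene's pairwise interdependence with the other genes in its cluster, the pairwise measure is assumed to be I(X;Y) / H(X,Y) on discretized values, and `select_top_genes` and its `per_cluster` parameter are hypothetical names.

```python
import math
from collections import Counter


def _entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def interdependence(x, y):
    """Assumed pairwise measure: I(X;Y) / H(X,Y)."""
    h_xy = _entropy(list(zip(x, y)))
    if h_xy == 0:
        return 0.0
    return (_entropy(x) + _entropy(y) - h_xy) / h_xy


def select_top_genes(columns, clusters, per_cluster=1):
    """Return a pool holding the top-ranked genes of every cluster.

    columns:  dict mapping gene name -> list of discretized expression values.
    clusters: list of lists of gene names (one list per cluster).
    """
    pool = []
    for members in clusters:
        def score(gene):
            # Summed interdependence with every other gene in the same cluster.
            return sum(interdependence(columns[gene], columns[other])
                       for other in members if other != gene)
        ranked = sorted(members, key=score, reverse=True)
        pool.extend(ranked[:per_cluster])
    return pool


# Toy data: g1 and g2 are identical, g3 is independent of both, g4 is alone.
toy = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],
    "g4": [0, 1, 0, 1, 0, 1, 0, 1],
}
pool = select_top_genes(toy, [["g1", "g2", "g3"], ["g4"]], per_cluster=1)
print(pool)
```

The resulting pool would then be passed to whatever classifier is under evaluation; the sketch stops at selection because the abstract does not name the classifiers used.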