In many domains, data objects are described by a large number of features (e.g., microarray experiments or spectral characterizations of organic and inorganic samples). A pipelined approach that combines two clustering algorithms with rough sets is investigated for discovering important combinations of attributes in high-dimensional data. The Leader algorithm and several k-means algorithms are used as fast procedures for simplifying the attribute sets of the information systems presented to the rough set algorithms. The data, described in terms of these fewer features, are then discretized with respect to the decision attribute according to different rough-set-based schemes. Reducts and their derived rules are extracted from the discretized data and applied to test data to evaluate classification accuracy in cross-validation experiments. The data mining process is implemented within a high-throughput distributed computing environment. Nonlinear transformations of attribute subsets that preserve the similarity structure of the data were also investigated. Their classification ability, and that of the attribute subsets obtained after the mining process, was described in terms of analytic functions obtained by genetic programming (gene expression programming) and simplified using computer algebra systems. Visual data mining techniques based on virtual reality were used to inspect the results. This approach was explored in a series of experiments on leukemia, colon cancer, and breast cancer gene expression data, which led to small subsets of genes with high discrimination power.
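The attribute simplification step can be illustrated with a minimal sketch of single-pass Leader clustering applied to attribute (column) vectors. The abstract does not specify the dissimilarity measure, the threshold, or how cluster representatives are chosen, so the `pearson_distance` function, the `threshold` value, and the toy `columns` data below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def pearson_distance(u, v):
    """Return 1 - |Pearson r| for two equal-length sequences.

    This dissimilarity is an assumption; the abstract does not name one.
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # constant attribute: treat as unrelated to everything
    return 1.0 - abs(cov / (norm_u * norm_v))


def leader_clustering(attributes, threshold, distance=pearson_distance):
    """Single-pass Leader clustering over attribute vectors.

    Each attribute joins the first existing leader within `threshold`;
    otherwise it becomes a new leader. Returns (leaders, assignment), where
    leaders lists the indices of representative attributes and assignment[i]
    is the leader index chosen for attribute i.
    """
    leaders = []
    assignment = []
    for i, attr in enumerate(attributes):
        for lead in leaders:
            if distance(attr, attributes[lead]) <= threshold:
                assignment.append(lead)
                break
        else:
            # No leader was close enough, so this attribute starts a cluster.
            leaders.append(i)
            assignment.append(i)
    return leaders, assignment


# Hypothetical toy data: four attributes measured on four objects.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with attribute 0
    [4.0, 1.0, 3.0, 2.0],
    [8.1, 2.0, 6.0, 4.1],  # nearly proportional to attribute 2
]
leaders, assignment = leader_clustering(columns, threshold=0.05)
print(leaders)     # indices of the attributes kept as representatives
print(assignment)  # leader index assigned to each attribute
```

In this sketch the leaders alone would form the reduced attribute set passed on to discretization and reduct computation. Because the Leader algorithm makes a single pass, its result depends on the order in which attributes are presented, which is the usual trade-off for its speed.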