Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis

Authors:
Pradipta Maji;Sankar K. Pal
Affiliations:
-;IEEE
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2007


Abstract

In most pattern recognition algorithms, amino acids cannot be used directly as inputs since they are nonnumerical variables. They therefore need encoding prior to input. In this regard, the bio-basis function maps a nonnumerical sequence space to a numerical feature space. It is designed using an amino acid mutation matrix. One of the important issues for the bio-basis function is how to select the minimum set of bio-bases with maximum information. In this paper, we describe an algorithm, termed the rough-fuzzy c-medoids (RFCMdd) algorithm, to select the most informative bio-bases. It comprises a judicious integration of the principles of rough sets, fuzzy sets, the c-medoids algorithm, and the amino acid mutation matrix. While the membership function of fuzzy sets enables efficient handling of overlapping partitions, the concept of lower and upper bounds of rough sets deals with uncertainty, vagueness, and incompleteness in class definition. The concept of a crisp lower bound and a fuzzy boundary of a class, introduced in RFCMdd, enables efficient selection of the minimum set of the most informative bio-bases. Some new indices are introduced for quantitatively evaluating the quality of the selected bio-bases. The effectiveness of the proposed algorithm, along with a comparison with other algorithms, is demonstrated on different types of protein data sets.
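As a rough illustration of how a bio-basis function can turn a nonnumerical sequence into a numerical feature vector, the sketch below scores a query sequence against each selected bio-basis. The abstract does not state the formula, so the exponential form used here, exp(gamma * (h(x, b) - h(b, b)) / h(b, b)), is an assumption based on the form commonly given for bio-basis functions. The three-letter score table `TOY_SCORES`, the function names, and the `gamma` parameter are hypothetical stand-ins; a real application would use a full 20 x 20 amino acid mutation matrix.

```python
import math

# Hypothetical toy similarity scores over a 3-letter alphabet.
# These values are illustrative only and are NOT taken from any
# published amino acid mutation matrix.
TOY_SCORES = {
    ("A", "A"): 4, ("A", "G"): 1, ("A", "L"): 0,
    ("G", "G"): 5, ("G", "L"): 0,
    ("L", "L"): 4,
}


def pair_score(a, b):
    """Look up the symmetric similarity score of two residues."""
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]


def homology(x, y):
    """Sum the pairwise scores of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(x, y))


def bio_basis(x, basis, gamma=1.0):
    """Assumed bio-basis form: equals 1.0 when x is the basis itself."""
    self_score = homology(basis, basis)
    return math.exp(gamma * (homology(x, basis) - self_score) / self_score)


def encode(x, bases, gamma=1.0):
    """Map one sequence to a numerical vector, one entry per bio-basis."""
    return [bio_basis(x, b, gamma) for b in bases]


# Hypothetical bio-bases and query sequence.
bases = ["AGL", "LLA"]
features = encode("AGL", bases)
print(features)
```

Under this assumed form, each feature is largest when the query matches the bio-basis exactly, so the length of the feature vector equals the number of selected bio-bases. That is why selecting a small, informative set of bio-bases matters for the dimensionality of the encoded data.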