In recent years, many technologies for analyzing genes have been proposed. A large number of biological databases, covering microarray data, biomedical literature, sequence data, genome structure data and so on, have formed useful data warehouses from which gene-gene relations can be mined and gene networks predicted in advance. In bioinformatics, clustering of gene expression data is a common technique for extracting new knowledge. However, raising the accuracy of gene clusters is a challenge because of errors in the biological databases and the divergence between the results of different clustering methods. In this paper, Multi-Source Soft Clustering (MSSC), an integrated framework that combines clustering methods with multi-source databases, is presented to raise that accuracy. Two soft clustering methods, fuzzy c-means and soft CAST, are applied to address the problem that a gene may have multiple functions and may be involved in several biological pathways. Combining microarray data with biomedical literature may improve overall accuracy compared with using a single dataset. In addition, MSSC adopts the concept of clustering before integrating: each source is clustered first, and the statistical correlation coefficient is then used to calculate the distances between the matrices produced by the diverse soft clustering results. The experimental results show that the MSSC approach can be comparatively more effective.
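The "clustering before integrating" idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each data source is clustered separately with a textbook fuzzy c-means, that each membership matrix is turned into a gene-by-gene co-membership matrix, and that the distance between two results is one minus the Pearson correlation of those matrices. The function names (`fuzzy_c_means`, `co_membership`, `correlation_distance`), the co-membership construction, the parameter defaults and the synthetic data are all assumptions made for illustration; soft CAST is not shown.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (U, centers), U has shape (n_genes, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each gene's memberships sum to 1
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every gene to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


def co_membership(U):
    """Gene-by-gene matrix; entry (i, j) is the dot product of rows i and j of U.

    Assumption: this form is used because it does not depend on how the
    clusters happen to be numbered in each result.
    """
    return U @ U.T


def correlation_distance(U_a, U_b):
    """1 - Pearson r over the off-diagonal co-membership entries (assumed metric)."""
    iu = np.triu_indices(U_a.shape[0], k=1)
    a = co_membership(U_a)[iu]
    b = co_membership(U_b)[iu]
    return 1.0 - np.corrcoef(a, b)[0, 1]


# Synthetic stand-ins for two sources describing the same 30 genes
# (hypothetical data, not from the paper).
rng = np.random.default_rng(42)
groups = np.repeat(np.arange(3), 10)
expr = rng.normal(size=(3, 6))[groups] * 3 + rng.normal(scale=0.3, size=(30, 6))
text = rng.normal(size=(3, 8))[groups] * 3 + rng.normal(scale=0.3, size=(30, 8))

U_expr, _ = fuzzy_c_means(expr, c=3)
U_text, _ = fuzzy_c_means(text, c=3)
print("distance between the two soft clusterings:",
      correlation_distance(U_expr, U_text))
```

A small distance suggests that the two sources group the genes in similar ways, while a distance near 1 suggests little agreement. Comparing co-membership matrices rather than raw membership matrices is a design choice of this sketch: cluster labels from independent runs have no fixed order, so columns of the raw matrices cannot be compared directly.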