Multi-platform gene-expression mining and marker gene analysis

Authors:
Qian Xu; Hong Xue; Qiang Yang
Affiliations:
Bioengineering Programme, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong; Department of Biochemistry, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2011

Abstract

Gene-expression data are now widely available and are used for a wide range of clinical and diagnostic purposes. A key challenge is to select a small number of significant marker genes for biological study. Although important genes can be found from a single gene-expression data set, it is often more meaningful to compare results across different but related data sets, especially when multiple data sets arise from different studies of a common organism or phenotype. In this paper, we present a novel framework that exploits the commonalities across data sets by learning from them jointly through multi-task feature learning. By identifying a common subspace of genes, the framework can help biologists find important marker genes that span different evolutionary periods in the life cycle of cancer development. The genes found in this way are more stable and more significant. Our experimental results demonstrate that more accurate models can be built from multiple data sets using fewer labelled examples. To the best of our knowledge, we are among the first to introduce multi-task learning to the bioinformatics community to address the problem of insufficient data.
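
The abstract does not spell out the optimisation problem, so the following is only a minimal sketch of one common way to learn a shared gene subset across data sets, not the authors' method. It assumes that every data set measures the same genes, that a squared loss on numeric labels is an acceptable stand-in for the classification loss, and that sharing is enforced by a row-wise (L2,1) penalty solved with proximal gradient descent. The function name `multitask_feature_selection`, the parameters `lam` and `n_iter`, and the synthetic data are all hypothetical.

```python
import numpy as np


def multitask_feature_selection(tasks, lam=0.1, n_iter=500):
    """Illustrative L2,1-regularised multi-task regression (an assumption,
    not the paper's formulation).

    tasks: list of (X_t, y_t) pairs; every X_t has the same gene columns.
    Returns W with shape (n_genes, n_tasks); a row of zeros means the gene
    is dropped for every task at once.
    """
    n_genes = tasks[0][0].shape[1]
    W = np.zeros((n_genes, len(tasks)))
    # Step size 1/L, with L the largest Lipschitz constant of the
    # per-task squared-loss gradients.
    L = max(np.linalg.norm(X, 2) ** 2 / len(y) for X, y in tasks)
    step = 1.0 / L
    for _ in range(n_iter):
        # Gradient of the smooth part, one column per task.
        G = np.column_stack(
            [X.T @ (X @ W[:, t] - y) / len(y) for t, (X, y) in enumerate(tasks)]
        )
        V = W - step * G
        # Row-wise soft threshold: proximal step for lam * sum_g ||W[g, :]||_2.
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        W = V * shrink
    return W


# Synthetic stand-in for two related expression data sets (hypothetical data):
# 20 genes, of which only the first three carry signal in both sets.
rng = np.random.default_rng(0)
true_w = np.zeros(20)
true_w[:3] = [2.0, -1.5, 1.0]
tasks = []
for n_samples in (200, 150):
    X = rng.standard_normal((n_samples, 20))
    y = X @ true_w + 0.1 * rng.standard_normal(n_samples)
    tasks.append((X, y))

W = multitask_feature_selection(tasks, lam=0.1)
row_norms = np.linalg.norm(W, axis=1)
selected = np.flatnonzero(row_norms > 0)
print("selected gene indices:", selected)
```

Because the penalty acts on whole rows of `W`, a gene is either kept for all data sets or removed from all of them, which is one way to read "a common subspace of genes". Larger values of `lam` keep fewer genes.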