Logical analysis of diffuse large B-cell lymphomas

Authors:
G. Alexe;S. Alexe;D. E. Axelrod;P. L. Hammer;D. Weissmann
Affiliations:
Center for Systems Biology, Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA and RUTCOR, Rutgers, The State University of New Jersey, 640 Bartholomew Road, Piscataway, NJ 088 ...;RUTCOR, Rutgers, The State University of New Jersey, 640 Bartholomew Road, Piscataway, NJ 08854, USA;Department of Genetics, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA;RUTCOR, Rutgers, The State University of New Jersey, 640 Bartholomew Road, Piscataway, NJ 08854, USA;Department of Pathology, Robert Wood Johnson University Hospital, One Robert Wood Johnson Place, New Brunswick, NJ 08903, USA
Venue:
Artificial Intelligence in Medicine
Year:
2005

Citing 6

Cause-effect relationships and partially defined Boolean functions
Annals of Operations Research

An Implementation of Logical Analysis of Data
IEEE Transactions on Knowledge and Data Engineering

An introduction to variable and feature selection
The Journal of Machine Learning Research

Accelerated algorithm for pattern detection in logical analysis of data
Discrete Applied Mathematics - Special issue: Discrete mathematics & data mining II (DM & DM II)

Comprehensive vs. comprehensible classifiers in logical analysis of data
Discrete Applied Mathematics

Gene expression data analysis of human lymphoma using support vector machines and output coding ensembles
Artificial Intelligence in Medicine

Cited 4

A Robust Meta-classification Strategy for Cancer Diagnosis from Gene Expression Data
CSB '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference

Logical analysis of data --- the vision of Peter L. Hammer
Annals of Mathematics and Artificial Intelligence

MILP approach to pattern generation in logical analysis of data
Discrete Applied Mathematics

Formal concept analysis for the identification of combinatorial biomarkers in breast cancer
ICFCA'08 Proceedings of the 6th international conference on Formal concept analysis


Abstract

Objective: The goal of this study is to re-examine the oligonucleotide microarray dataset of Shipp et al. (www.genome.wi.mit.edu/MPR/lymphoma), which contains the intensity levels of 6817 genes in 58 patients with diffuse large B-cell lymphoma (DLBCL) and 19 patients with follicular lymphoma (FL), using logical analysis of data (LAD), a methodology based on combinatorics, optimization, and logic. The motivations for this new analysis included the previously demonstrated capabilities of LAD and its expected potential (1) to identify informative genes different from those discovered by conventional statistical methods, (2) to identify combinations of gene expression levels capable of characterizing different types of lymphoma, and (3) to assemble collections of such combinations that, considered jointly, can accurately distinguish different types of lymphoma.

Methods and materials: The central concept of LAD is a pattern, or combinatorial biomarker, which resembles a rule as used in decision tree methods. LAD can exhaustively generate all patterns that satisfy certain quality constraints, through a systematic combinatorial process guided by clear optimization criteria. Then, using a set covering approach, LAD aggregates the collection of patterns into classification models. In addition, LAD can use the information provided by large collections of patterns to extract subsets of variables that collectively distinguish between different types of disease.

Results: For the differential diagnosis of DLBCL versus FL, a model based on eight significant genes is constructed and shown to have a sensitivity of 94.7% and a specificity of 100% on the test set. For the prognosis of good versus poor outcome among the DLBCL patients, a model is constructed on another set, also consisting of eight significant genes, and shown to have a sensitivity of 87.5% and a specificity of 90% on the test set. The genes selected by LAD also work well as a basis for other kinds of statistical analysis, indicating their robustness.

Conclusion: These two models exhibit accuracies that compare favorably to those in the original study. In addition, the current study provides a ranking by importance of the genes in the selected significant subsets, as well as a library of dozens of combinatorial biomarkers (i.e. pairs or triplets of genes) that can serve as a source of mathematically generated, statistically significant research hypotheses in need of biological explanation.
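
The two steps named in the methods description, pattern generation followed by set covering, can be illustrated with a toy sketch. This is not the authors' implementation: the data are made up, the three binary columns stand for hypothetical "gene above cut-point" indicators, only patterns for one class are built, and a simple greedy heuristic stands in for whatever set covering procedure the study used. The function names `covers`, `pure_positive_patterns`, and `greedy_cover` are introduced here purely for illustration.

```python
from itertools import combinations, product

# Made-up binarized data: each tuple is one sample, each position answers
# "is hypothetical gene g0/g1/g2 expressed above its cut-point?" (1 = yes).
positive = [(1, 1, 0), (1, 1, 1), (0, 0, 1), (0, 1, 1)]  # one class
negative = [(1, 0, 0), (0, 1, 0), (1, 0, 1), (0, 0, 0)]  # the other class


def covers(pattern, sample):
    """A pattern is a tuple of (column, required_value) conditions."""
    return all(sample[col] == val for col, val in pattern)


def pure_positive_patterns(pos, neg, max_degree=2):
    """Enumerate every conjunction of up to max_degree conditions that
    covers at least one positive sample and no negative sample."""
    n_cols = len(pos[0])
    found = []
    for degree in range(1, max_degree + 1):
        for cols in combinations(range(n_cols), degree):
            for vals in product((0, 1), repeat=degree):
                pattern = tuple(zip(cols, vals))
                if any(covers(pattern, s) for s in neg):
                    continue  # not pure: it also matches a negative sample
                covered = frozenset(
                    i for i, s in enumerate(pos) if covers(pattern, s)
                )
                if covered:
                    found.append((pattern, covered))
    return found


def greedy_cover(patterns, n_samples):
    """Greedy set cover: repeatedly take the pattern that covers the most
    positive samples not yet covered by an earlier choice."""
    uncovered = set(range(n_samples))
    chosen = []
    while uncovered and patterns:
        best = max(patterns, key=lambda pc: len(pc[1] & uncovered))
        if not best[1] & uncovered:
            break  # remaining samples cannot be covered by any pattern
        chosen.append(best[0])
        uncovered -= best[1]
    return chosen, uncovered


patterns = pure_positive_patterns(positive, negative)
model, uncovered = greedy_cover(patterns, len(positive))
print("patterns found:", len(patterns))
print("patterns kept in the model:", model)
print("positive samples left uncovered:", uncovered)
```

In this toy data no single column separates the two classes, so every pure pattern is a pair of conditions, which loosely mirrors the pairs of genes mentioned in the conclusion. A full analysis would, at a minimum, also derive the cut-points from continuous expression levels and build patterns for the second class.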