Artificial Intelligence in Medicine
Objective: The goal of this study is to re-examine the oligonucleotide microarray dataset of Shipp et al. (www.genome.wi.mit.edu/MPR/lymphoma), which contains the intensity levels of 6817 genes for 58 patients with diffuse large B-cell lymphoma (DLBCL) and 19 with follicular lymphoma (FL), by means of logical analysis of data (LAD), a methodology based on combinatorics, optimization, and logic. The motivations for this new analysis included the previously demonstrated capabilities of LAD and its expected potential (1) to identify informative genes different from those discovered by conventional statistical methods, (2) to identify combinations of gene expression levels capable of characterizing different types of lymphoma, and (3) to assemble collections of such combinations that, considered jointly, can accurately distinguish different types of lymphoma.

Methods and materials: The central concept of LAD is a pattern, or combinatorial biomarker, which resembles a rule as used in decision tree methods. LAD can exhaustively generate the collection of all patterns that satisfy certain quality constraints, through a systematic combinatorial process guided by clear optimization criteria. Then, based on a set covering approach, LAD aggregates the collection of patterns into classification models. In addition, LAD can use the information provided by large collections of patterns to extract subsets of variables that collectively distinguish between different types of disease.

Results: For the differential diagnosis of DLBCL versus FL, a model based on eight significant genes is constructed and shown to have a sensitivity of 94.7% and a specificity of 100% on the test set. For the prognosis of good versus poor outcome among the DLBCL patients, a model is constructed on a different set, also of eight significant genes, and shown to have a sensitivity of 87.5% and a specificity of 90% on the test set. The genes selected by LAD also work well as a basis for other kinds of statistical analysis, indicating their robustness.

Conclusion: These two models exhibit accuracies that compare favorably with those in the original study. In addition, the current study provides a ranking by importance of the genes in the selected significant subsets, as well as a library of dozens of combinatorial biomarkers (i.e., pairs or triplets of genes) that can serve as a source of mathematically generated, statistically significant research hypotheses in need of biological explanation.
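
The two steps described under Methods and materials, exhaustive pattern generation followed by set-covering aggregation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the toy binarized samples, the function names, the restriction to "pure" positive patterns of at most two literals, and the use of a greedy heuristic for the covering step (the study may solve the covering problem differently). A real LAD pipeline would also binarize gene intensities with cut-points and build negative patterns as well.

```python
from itertools import combinations, product

# Hypothetical toy data: each tuple is one sample, each position a binary
# attribute such as "gene g is expressed above some cut-point".
positives = [(1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1)]
negatives = [(0, 0, 1, 0), (1, 0, 0, 0), (0, 1, 0, 0)]


def covers(pattern, sample):
    """A pattern is a tuple of (attribute index, required value) literals."""
    return all(sample[i] == v for i, v in pattern)


def enumerate_pure_patterns(pos, neg, max_degree=2):
    """List every conjunction of at most max_degree literals that covers
    at least one positive sample and no negative sample."""
    n_attrs = len(pos[0])
    found = []
    for degree in range(1, max_degree + 1):
        for attrs in combinations(range(n_attrs), degree):
            for values in product((0, 1), repeat=degree):
                pattern = tuple(zip(attrs, values))
                if any(covers(pattern, s) for s in neg):
                    continue  # quality constraint: no negative sample covered
                covered = frozenset(
                    k for k, s in enumerate(pos) if covers(pattern, s)
                )
                if covered:
                    found.append((pattern, covered))
    return found


def greedy_cover(patterns, n_samples):
    """Greedy set covering: repeatedly pick the pattern that covers the
    most still-uncovered positive samples."""
    uncovered = set(range(n_samples))
    model = []
    while uncovered and patterns:
        pattern, covered = max(patterns, key=lambda pc: len(pc[1] & uncovered))
        if not covered & uncovered:
            break  # no pattern covers any of the remaining samples
        model.append(pattern)
        uncovered -= covered
    return model, uncovered


patterns = enumerate_pure_patterns(positives, negatives)
model, uncovered = greedy_cover(patterns, len(positives))
print("selected patterns:", model)
print("uncovered positives:", uncovered)
```

In this toy data the fourth attribute alone separates the two classes, so the covering step needs only a single one-literal pattern. With real expression data, several two- or three-gene patterns would typically be needed to cover every sample.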