DNA copy number amplifications are hallmarks of many cancers. In this work, we analyzed genome-wide DNA copy number amplification data collected from more than 4500 neoplasm cases. Using a 0-1 representation of the data, we trained finite mixtures of multivariate Bernoulli distributions with the EM algorithm to describe the inherent structure of the data. The component distributions of the resulting mixtures yielded plausible, localized amplification patterns. Individual amplification patterns were tested for their role in cancer groups formed according to known risk associations. Our detailed analysis of chromosome 1 showed that asbestos exposure-related and hormonal imbalance-associated cancers were clustered, and it identified specific chromosome bands, 1p34 and 1q42. These sites contain cancer genes, which might explain the condition-specific selection of these loci for amplification.
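To make the modeling step concrete, the following is a minimal sketch of EM for a finite mixture of multivariate Bernoulli distributions on 0-1 data. It is not the authors' implementation: the function name `fit_bernoulli_mixture`, its parameters, the random initialization, the clipping constant `eps`, and the toy data are all assumptions made for illustration, and the abstract does not describe how the number of components was chosen.

```python
import math
import random


def fit_bernoulli_mixture(X, n_components, n_iter=50, seed=0, eps=1e-6):
    """Illustrative EM for a mixture of multivariate Bernoulli distributions.

    X is a list of equal-length 0-1 rows (one row per case).
    Returns (weights, thetas, resp): mixing weights, per-component
    Bernoulli parameters, and per-row component responsibilities.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / n_components] * n_components
    # Random start away from 0 and 1 so every logarithm below is finite.
    thetas = [[rng.uniform(0.25, 0.75) for _ in range(d)]
              for _ in range(n_components)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row,
        # computed in log space and normalized after subtracting the max.
        resp = []
        for x in X:
            logp = []
            for w, th in zip(weights, thetas):
                ll = math.log(w)
                for xi, t in zip(x, th):
                    ll += math.log(t) if xi else math.log(1.0 - t)
                logp.append(ll)
            m = max(logp)
            p = [math.exp(v - m) for v in logp]
            s = sum(p)
            resp.append([v / s for v in p])
        # M-step: responsibility-weighted averages; eps keeps the weights
        # positive and the parameters strictly inside (0, 1).
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = (nk + eps) / (n + n_components * eps)
            new_theta = []
            for j in range(d):
                num = sum(r[k] * x[j] for r, x in zip(resp, X))
                new_theta.append(min(1.0 - eps, max(eps, num / (nk + eps))))
            thetas[k] = new_theta
    return weights, thetas, resp


# Toy 0-1 data (hypothetical): two non-overlapping "amplification" patterns.
pattern_a = [1, 1, 1, 0, 0, 0]
pattern_b = [0, 0, 0, 1, 1, 1]
X = [pattern_a] * 20 + [pattern_b] * 20

weights, thetas, resp = fit_bernoulli_mixture(X, n_components=2)
for w, th in zip(weights, thetas):
    print(round(w, 3), [round(t, 2) for t in th])
```

Each fitted component is a vector of per-position probabilities, which is what allows a component to be read as a localized pattern over chromosome bands. A practical analysis would also need a model selection step and several random restarts, neither of which is shown here.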