Clinical tests and epidemiological studies often produce large amounts of multivariate data. Analysing these data is, in most cases, as important as the clinical and sampling tasks themselves. Simple, easily interpretable techniques from chemometrics provide most of the ingredients needed for this analysis. We selected available data from different sources pertaining to cancer diagnosis and incidence: (1) cytological diagnosis of breast cancer, (2) classification of breast tissues through parameters obtained from impedance spectra, and (3) distribution of new cancer cases in the United States. Hierarchical cluster analysis (HCA) is needed especially where there is no a priori identification of classes, since it suggests a structure of the data based on clusters. These clusters, or the classes, are then further detailed and rationalized by principal component analysis (PCA). Partial least squares (PLS) and linear discriminant analysis (LDA) provide further insight into the systems. An additional step for understanding the data set is the removal of less characteristic data (NR) using a density-based approach, which makes the structure of the data set more clearly defined. The results reveal that breast cytology diagnosis relies on variables that convey mostly the same type of information and are thus interchangeable. In the study on tissue characterization by electrical measurements, the distribution of the different types of tissues can be easily constructed. Finally, the distribution of new cancer cases shows clear, easily unravelled geographical patterns.
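As a rough illustration of two of the steps named above, the following sketch first removes the least characteristic points with a density-based criterion and then groups the remaining points by hierarchical clustering. It is not the authors' procedure: the density estimate (mean distance to the k nearest neighbours), the `keep_fraction` threshold, the use of average linkage with Euclidean distance, the function names, and the toy data are all assumptions made for this example.

```python
import math
from itertools import combinations


def remove_low_density(points, k=2, keep_fraction=0.8):
    """Keep the points with the smallest mean distance to their k nearest
    neighbours (an assumed stand-in for a density-based NR step)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append((sum(dists[:k]) / k, i))
    n_keep = max(1, int(round(keep_fraction * len(points))))
    kept = sorted(i for _, i in sorted(scores)[:n_keep])
    return [points[i] for i in kept]


def average_linkage_clusters(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest average pairwise Euclidean distance."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(
                math.dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (invented): two compact groups plus two isolated points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (2.5, 9.0), (9.0, 0.5),
]

filtered = remove_low_density(points, k=2, keep_fraction=0.8)
clusters = average_linkage_clusters(filtered, n_clusters=2)
print(len(filtered), clusters)
```

With these toy points the two isolated observations are discarded before clustering, so the two compact groups are recovered without being distorted by them. In practice the clusters would then be examined further, for instance with PCA, as the abstract describes.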