The goal of model-order selection is to select a model variant that generalizes best from training data to unseen test data. In unsupervised learning, where no labels are available, computing the generalization error of a solution poses a conceptual problem, which we address in this paper. We formulate the principle of "minimum transfer costs" for model-order selection. This principle renders the concept of cross-validation applicable to unsupervised learning problems. As a substitute for labels, we introduce a mapping between objects of the training set and objects of the test set that enables the transfer of training solutions. We explain and investigate our method by applying it to well-known problems such as singular-value decomposition, correlation clustering, Gaussian mixture models, and k-means clustering. Our principle finds the optimal model complexity in controlled experiments and in real-world problems such as image denoising, role mining, and detection of misconfigurations in access-control data.
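To make the transfer idea concrete, the following is a minimal sketch for k-means clustering, not the authors' implementation. It assumes a nearest-neighbor mapping (each test object is mapped to its closest training object in squared Euclidean distance) and a single random train/test split; the abstract does not specify either choice. The function names `kmeans`, `transfer_cost`, and `select_k` are hypothetical and introduced only for illustration.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; returns (centroids, labels) for X."""
    # Initialise centroids with k distinct training points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def transfer_cost(X_train, X_test, k, rng):
    """Cost of the training solution after transferring it to the test set."""
    centroids, train_labels = kmeans(X_train, k, rng)
    # Assumed mapping: each test object -> its nearest training object.
    dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest_train = dist.argmin(axis=1)
    # The test object inherits the cluster of the training object it maps to.
    transferred = train_labels[nearest_train]
    # k-means cost of the test data under the transferred assignment.
    return float(((X_test - centroids[transferred]) ** 2).sum(axis=1).mean())


def select_k(X, ks, seed=0):
    """Transfer cost per candidate k on one random half/half split."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    half = len(X) // 2
    X_train, X_test = X[perm[:half]], X[perm[half:]]
    return {k: transfer_cost(X_train, X_test, k, rng) for k in ks}


if __name__ == "__main__":
    # Synthetic data: three well-separated Gaussian blobs (illustrative only).
    data_rng = np.random.default_rng(1)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    X = np.vstack([c + data_rng.normal(size=(100, 2)) for c in centers])
    costs = select_k(X, ks=range(1, 7))
    for k, cost in costs.items():
        print(k, round(cost, 3))
    print("selected k:", min(costs, key=costs.get))
```

Under this reading, the model order is the candidate `k` with the lowest transfer cost. In practice one would probably average the cost over several random splits rather than rely on a single one, since both the split and the k-means initialisation introduce variance.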