Unsupervised learning methods often summarize the data with a small number of parameters. In certain domains, however, only a small subset of the available data is relevant to the problem. One-class classification, or one-class clustering, attempts to find such a useful subset by locating a dense region in the data. In particular, a recently proposed algorithm called One-Class Information Bottleneck (OC-IB) shows the advantage of modeling a small set of highly coherent points, as opposed to merely pruning outliers. We present several modifications to OC-IB and integrate it with a global search, yielding several improvements: deterministic results, optimality guarantees, control over cluster size, and extension to other cost functions. Empirical studies on various real and artificial datasets show significantly better results.
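To make the idea concrete, here is a minimal sketch of one-class clustering in the spirit described above: alternately pick the k points nearest the current center and re-center on them, and wrap this local search in multiple restarts so the lowest-cost solution is kept. This is an illustrative simplification, not the actual OC-IB algorithm or the authors' method; the function name, the Euclidean cost, and the restart scheme are all assumptions made for the example.

```python
import numpy as np

def one_class_cluster(X, k, n_restarts=20, max_iter=100, seed=0):
    """Find a dense subset of k points (illustrative sketch, not OC-IB).

    Local step: alternate between (a) selecting the k points nearest the
    current center and (b) re-centering on their mean. Global step: restart
    from several random points and keep the lowest-cost subset, which makes
    the result far less sensitive to initialization.
    """
    rng = np.random.default_rng(seed)
    best_cost, best_idx = np.inf, None
    for _ in range(n_restarts):
        c = X[rng.integers(len(X))]              # random initial center
        for _ in range(max_iter):
            d = np.linalg.norm(X - c, axis=1)    # distance of each point to center
            idx = np.argpartition(d, k)[:k]      # indices of the k nearest points
            new_c = X[idx].mean(axis=0)          # re-center on the selected subset
            if np.allclose(new_c, c):            # converged
                break
            c = new_c
        cost = np.linalg.norm(X[idx] - c, axis=1).sum()  # coherence of the subset
        if cost < best_cost:
            best_cost, best_idx = cost, idx
    return np.sort(best_idx), best_cost
```

Note how the cluster size k is an explicit input, and how repeating the local search from many starting points trades computation for reproducibility — the two properties (control over cluster size, deterministic results) that the abstract highlights as improvements over a single randomly seeded local search.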