The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand but may also vary from one clustering problem to another. As a first step towards building robust document clusterers, this work presents a strategy based on feature diversity and cluster ensembles. Experiments conducted on a binary clustering problem show that our method is robust to near-optimal model order selection and able to detect constructive interactions between different document representations in the test bed.
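
To make the idea of combining clusterings obtained from different document representations concrete, the following is a minimal sketch of one common consensus scheme: a co-association (evidence accumulation) matrix followed by thresholded linking. The abstract does not say which consensus function the authors use, so this technique is an assumption chosen for illustration. The function names, the 0.5 threshold, and the toy label vectors are likewise hypothetical. Each input partition stands in for the output of a clusterer run on one representation of the same documents.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that place documents i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus(partitions, threshold=0.5):
    """Link every pair of documents that co-occurs in more than
    `threshold` of the partitions, then return the connected
    components as consensus cluster labels (assumed scheme)."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))  # union-find forest over documents

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical partitions of six documents, one per representation.
# The second uses permuted label names; the third misplaces document 2.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
labels = consensus(partitions)
```

Because the co-association matrix only records whether two documents share a cluster, it is insensitive to how each individual clusterer names its clusters, which is why the permuted labels in the second partition cause no trouble. In this toy input the single disagreement in the third partition is outvoted by the other two.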