We present a novel sequential clustering algorithm motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential (sIB) approach is guaranteed to converge to a local maximum of the information, as required by the original IB principle. Moreover, its time and space complexity are significantly improved, and are typically linear in the data size. We apply this algorithm to unsupervised document classification. In our evaluation, on small and medium-size corpora, the sIB is found to be consistently superior to all the other clustering methods we examine, typically by a significant margin. Moreover, the sIB results are comparable to those obtained by a supervised Naive Bayes classifier. Finally, we propose a simple procedure for trading cluster recall for higher precision, and show how this approach can extract clusters that match the existing topics of the corpus almost perfectly.
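To make the sequential idea concrete, the following is a minimal Python sketch of a draw-and-merge clustering loop in the spirit of sIB. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the update rule, so the merge cost used here (a prior-weighted Jensen-Shannon divergence between the word distributions of a document and a cluster) is an assumed choice, and the names `sequential_ib`, `merge_cost`, `max_sweeps`, and `seed` are hypothetical. The input is assumed to be a document-by-word count matrix with no all-zero rows and at least `k` rows. For brevity the sketch recomputes cluster statistics on every step, so it does not reproduce the linear-time behavior described above.

```python
import numpy as np


def _kl(p, q):
    """KL divergence in nats; entries where p == 0 contribute zero."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def merge_cost(px, py_x, pt, py_t):
    """Assumed cost of merging item x into cluster t:
    (p(x) + p(t)) times the weighted Jensen-Shannon divergence
    between p(y|x) and p(y|t)."""
    total = px + pt
    w_x, w_t = px / total, pt / total
    mix = w_x * py_x + w_t * py_t
    return total * (w_x * _kl(py_x, mix) + w_t * _kl(py_t, mix))


def sequential_ib(counts, k, max_sweeps=20, seed=0):
    """Hypothetical sketch: sequentially reassign each document to the
    cluster with the lowest merge cost until a full sweep changes nothing."""
    rng = np.random.default_rng(seed)
    joint = np.asarray(counts, dtype=float)
    joint = joint / joint.sum()              # empirical p(x, y)
    n = joint.shape[0]
    # Random, near-balanced start; every cluster is non-empty when n >= k.
    labels = rng.permutation(n) % k
    for _ in range(max_sweeps):
        changed = False
        for x in rng.permutation(n):
            old = int(labels[x])
            if np.sum(labels == old) == 1:
                continue                     # never empty a cluster
            labels[x] = -1                   # draw x out of its cluster
            px = joint[x].sum()
            py_x = joint[x] / px
            costs = np.empty(k)
            for t in range(k):
                joint_t = joint[labels == t].sum(axis=0)   # p(t, y)
                pt = joint_t.sum()
                costs[t] = merge_cost(px, py_x, pt, joint_t / pt)
            new = int(np.argmin(costs))
            labels[x] = new                  # merge x into the cheapest cluster
            if new != old:
                changed = True
        if not changed:
            break                            # no reassignments in a full sweep
    return labels
```

A call such as `sequential_ib(counts, k=2)` returns one cluster label per row of `counts`. Because the procedure only reaches a local optimum, a practical run would likely repeat it from several random starts and keep the best partition; the restart count would be a further assumption.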