What every computer scientist should know about floating-point arithmetic
ACM Computing Surveys (CSUR)
Elements of information theory
The nature of statistical learning theory
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss
Machine Learning - Special issue on learning with probabilistic representations
Distributional clustering of words for text classification
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
On feature distributional clustering for text categorization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Machine Learning
Introduction to Modern Information Retrieval
On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality
Data Mining and Knowledge Discovery
Concept Decompositions for Large Sparse Text Data Using Clustering
Machine Learning
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Hierarchically Classifying Documents Using Very Few Words
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Using Taxonomy, Discriminants, and Signatures for Navigating in Text Databases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Pattern Classification (2nd Edition)
Distributional clustering of English words
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
IEEE Transactions on Information Theory
Text Mining with Information-Theoretic Clustering
Computing in Science and Engineering
A practical web-based approach to generating topic hierarchy for text segments
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Data Driven Similarity Measures for k-Means Like Clustering Algorithms
Information Retrieval
Adaptive sampling for thresholding in document filtering and classification
Information Processing and Management: an International Journal
On the use of linear programming for unsupervised text classification
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Taxonomy generation for text segments: A practical web-based approach
ACM Transactions on Information Systems (TOIS)
Building implicit links from content for forum search
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Semi-supervised model-based document clustering: A comparative study
Machine Learning
Exploiting asymmetry in hierarchical topic extraction
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
A new feature selection score for multinomial naive Bayes text classification based on KL-divergence
ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
A semi-supervised feature clustering algorithm with application to word sense disambiguation
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Addressing diverse user preferences in SQL-query-result navigation
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Dynamic category profiling for text filtering and classification
Information Processing and Management: an International Journal
Co-clustering based classification for out-of-domain documents
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Interactive high-quality text classification
Information Processing and Management: an International Journal
Can Chinese web pages be classified with English data source?
Proceedings of the 17th international conference on World Wide Web
A heuristic algorithm for clustering rooted ordered trees
Intelligent Data Analysis
Visual explanation of evidence in additive classifiers
IAAI'06 Proceedings of the 18th conference on Innovative applications of artificial intelligence - Volume 2
Graph-based word clustering using a web search engine
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Adaptive email spam filtering based on information theory
WISE'07 Proceedings of the 8th international conference on Web information systems engineering
Automatically computed document dependent weighting factor facility for Naïve Bayes classification
Expert Systems with Applications: An International Journal
Long distance bigram models applied to word clustering
Pattern Recognition
EURASIP Journal on Audio, Speech, and Music Processing
Cluster based symbolic representation and feature selection for text classification
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Dissimilarity based feature selection for text classification: a cluster based approach
Proceedings of the International Conference & Workshop on Emerging Trends in Technology
Expert Systems with Applications: An International Journal
A divergence-oriented approach for web users clustering
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
Active learning for probability estimation using Jensen-Shannon divergence
ECML'05 Proceedings of the 16th European conference on Machine Learning
Journal of Intelligent Information Systems
Sensor selection to support practical use of health-monitoring smart environments
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
ICIRA'12 Proceedings of the 5th international conference on Intelligent Robotics and Applications - Volume Part III
p-PIC: Parallel power iteration clustering for big data
Journal of Parallel and Distributed Computing
The curse of 140 characters: evaluating the efficacy of SMS spam detection on Android
Proceedings of the Third ACM workshop on Security and privacy in smartphones & mobile devices
Multi-document text summarization using topic model and fuzzy logic
MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
In this paper we propose a new information-theoretic divisive algorithm for word clustering and apply it to text classification. Previous work has found that such "distributional clustering" of features improves classification accuracy over feature selection, especially when the number of features is small [2, 28]. However, the existing clustering techniques are agglomerative in nature and result in (i) sub-optimal word clusters and (ii) high computational cost. To explicitly capture the optimality of word clusters in an information-theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases the value of this objective function and thus converges to a local minimum. We show that our algorithm minimizes the "within-cluster Jensen-Shannon divergence" while simultaneously maximizing the "between-cluster Jensen-Shannon divergence". Compared with the previously proposed agglomerative strategies, our divisive algorithm achieves higher classification accuracy, especially when the number of features is small. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20 Newsgroups data set and a 3-level hierarchy of HTML documents collected from the Dmoz Open Directory.
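The abstract does not give the algorithm's details, so the following is only a minimal sketch of one way such a monotonically decreasing word-clustering loop could look. It assumes each word is represented by its class distribution p(C|w) and a prior weight, measures a word's distance to a cluster by the KL divergence to the cluster's prior-weighted mean distribution, and uses the sum of these weighted divergences (a weighted within-cluster Jensen-Shannon divergence) as the objective. The function names, the stopping rule, and the caller-supplied initial assignment are all illustrative assumptions, not the authors' implementation.

```python
import math


def kl(p, q):
    """KL(p || q) in bits; returns infinity if q is zero where p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def centroids(dists, priors, assign, n_clusters):
    """Prior-weighted mean class distribution of every non-empty cluster."""
    cents = {}
    for j in range(n_clusters):
        members = [t for t, a in enumerate(assign) if a == j]
        if not members:
            continue
        mass = sum(priors[t] for t in members)
        n_classes = len(dists[0])
        cents[j] = [
            sum(priors[t] * dists[t][c] for t in members) / mass
            for c in range(n_classes)
        ]
    return cents


def objective(dists, priors, assign, n_clusters):
    """Sum of prior-weighted KL divergences from each word to its cluster mean."""
    cents = centroids(dists, priors, assign, n_clusters)
    return sum(
        priors[t] * kl(dists[t], cents[assign[t]]) for t in range(len(dists))
    )


def cluster_words(dists, priors, assign, n_clusters, max_iter=100):
    """Alternate between recomputing cluster means and reassigning words.

    Hypothetical helper: returns the final assignment and the objective
    value recorded before the first pass and after every reassignment.
    """
    assign = list(assign)
    history = [objective(dists, priors, assign, n_clusters)]
    for _ in range(max_iter):
        cents = centroids(dists, priors, assign, n_clusters)
        # Move every word to the cluster whose mean distribution is closest.
        new_assign = [min(cents, key=lambda j: kl(p, cents[j])) for p in dists]
        if new_assign == assign:
            break  # no word moved, so the objective can no longer decrease
        assign = new_assign
        history.append(objective(dists, priors, assign, n_clusters))
    return assign, history


# Toy data (made up for illustration): four words, two classes.
word_dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
word_priors = [0.25, 0.25, 0.25, 0.25]
final_assign, objective_history = cluster_words(
    word_dists, word_priors, assign=[0, 0, 0, 1], n_clusters=2
)
print(final_assign)
print(objective_history)
```

In this sketch the objective cannot increase: with the cluster means held fixed, reassigning a word to its closest mean cannot raise its term, and the prior-weighted mean is the distribution that minimizes the weighted sum of KL divergences from a cluster's members. Clusters that become empty are simply dropped, which is a simplification chosen here for brevity.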