We describe a text categorization approach that combines distributional clusters of features with a support vector machine (SVM) classifier. Our feature selection approach employs distributional clustering of words via the recently introduced information bottleneck method, which generates a more efficient word-cluster representation of documents. Combined with the classification power of an SVM, this method yields high-performance text categorization that can outperform other recent methods in terms of categorization accuracy and representation efficiency. Comparing the accuracy of our method with that of other techniques, we observe that the results depend significantly on the data set. We discuss potential reasons for this dependency.
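
To make the word-cluster representation concrete, the following is a minimal sketch of the general idea, not the procedure used in this work. It assumes a simple swapped-in technique: words are merged greedily by the Jensen-Shannon divergence between their estimated class distributions p(class | word), without the word-prior weighting or the trade-off parameter of the information bottleneck method. All function names and the toy data are hypothetical. The resulting cluster-count vectors are the kind of compact input one would pass to an SVM; the classifier itself is omitted here.

```python
from collections import Counter, defaultdict
import math


def class_distributions(docs, labels):
    """Estimate p(class | word) from labelled token lists (illustrative helper)."""
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for word in tokens:
            counts[word][label] += 1
    dists = {}
    for word, per_class in counts.items():
        total = sum(per_class.values())
        dists[word] = [per_class[c] / total for c in classes]
    return dists


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cluster_words(dists, k):
    """Greedy agglomerative merging down to k clusters.

    Assumption: this stands in for the information bottleneck clustering;
    it merges the pair of clusters whose mean class distributions are closest.
    """
    clusters = [([word], list(dist)) for word, dist in sorted(dists.items())]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = js_divergence(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        words_i, dist_i = clusters[i]
        words_j, dist_j = clusters[j]
        n_i, n_j = len(words_i), len(words_j)
        merged_dist = [(n_i * a + n_j * b) / (n_i + n_j)
                       for a, b in zip(dist_i, dist_j)]
        merged = (words_i + words_j, merged_dist)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return {word: cid for cid, (words, _) in enumerate(clusters) for word in words}


def to_cluster_counts(tokens, word_to_cluster, k):
    """Collapse a bag of words into a k-dimensional vector of cluster counts."""
    vec = [0] * k
    for word in tokens:
        if word in word_to_cluster:  # words unseen in training are skipped
            vec[word_to_cluster[word]] += 1
    return vec


if __name__ == "__main__":
    # Toy data, purely illustrative.
    docs = [["goal", "match", "team"], ["team", "goal"],
            ["vote", "election"], ["election", "vote", "party"]]
    labels = ["sport", "sport", "politics", "politics"]
    mapping = cluster_words(class_distributions(docs, labels), k=2)
    print(to_cluster_counts(["goal", "vote", "goal"], mapping, 2))
```

In this sketch a six-word vocabulary is reduced to two features, which illustrates the representation-efficiency argument: the classifier sees one dimension per word cluster rather than one per word.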