Text categorization, the assignment of text documents to one or more pre-defined categories, is one of the most intensely researched text mining tasks. The task may be subdivided into two main parts: the representation of the text documents as vectors in some numerical vector space, and the application of a suitable supervised learning technique. This research focuses on the second part of the problem. The paper proposes constructing a classification model for each of the pre-defined categories, or themes, present in a corpus, using a term-frequency-based 'keyword' identification and document scoring technique. The documents misclassified by each of these category-specific models are then re-classified with the help of the other models. The effectiveness of the approach is demonstrated by experiments on two publicly available BBC News corpora. Good classification accuracy is observed on both corpora. Specifically, the macro-averaged and micro-averaged F-measures of the proposed method on the evaluation dataset for the BBC Sports corpus are 94.7% and 94.3%, respectively.
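The abstract does not give the details of the keyword identification or scoring steps, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that a category's keywords are simply its most frequent non-stop-word terms, that a document's score for a category is the weighted count of those keywords, and that the highest-scoring category wins. The names `STOP_WORDS`, `build_keyword_models`, `score`, `classify`, and the parameter `top_k` are hypothetical, as is the toy training data; the re-classification of misclassified documents is not shown.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "in", "of", "and", "to", "before"}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_keyword_models(labeled_docs, top_k=5):
    """Return {category: {keyword: weight}} from (text, category) pairs.

    Assumption: a keyword's weight is its relative frequency among all
    tokens seen for that category in the training documents.
    """
    counts = {}
    for text, category in labeled_docs:
        counts.setdefault(category, Counter()).update(tokenize(text))
    models = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        models[category] = {term: n / total for term, n in counter.most_common(top_k)}
    return models


def score(text, keywords):
    """Sum keyword weights multiplied by the keyword's frequency in the document."""
    tf = Counter(tokenize(text))
    return sum(weight * tf[term] for term, weight in keywords.items())


def classify(text, models):
    """Assign the category whose keyword model gives the highest score."""
    return max(models, key=lambda category: score(text, models[category]))


# Hypothetical toy corpus, not taken from the BBC data sets.
train = [
    ("the striker scored a late goal in the football match", "football"),
    ("the football club signed a new striker before the match", "football"),
    ("the batsman hit a century in the cricket test", "cricket"),
    ("the cricket bowler took five wickets in the test", "cricket"),
]
models = build_keyword_models(train)
print(classify("a late goal won the football match", models))
print(classify("the bowler took wickets in the cricket test", models))
```

One model is built per category here, which mirrors the category-specific structure the abstract describes; the choice of relative frequency as the keyword weight is only one of several plausible options.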