This work implements a new text document classifier by integrating the K-nearest neighbor (KNN) classification approach with the support vector machine (SVM) training algorithm. The proposed hybrid Nearest Neighbor-Support Vector Machine classification approach is called SVM-NN. KNN has been reported as one of the widely used text classification approaches because of its simplicity and its efficiency in handling various types of text classification tasks. However, a major problem with KNN is determining an appropriate value for the parameter K, because the choice of K has a strong impact on the accuracy of the classifier. In addition, KNN is a lazy learning method that keeps all training samples until classification time, so classification becomes computationally intensive as the value of K increases. In this paper, we propose the SVM-NN hybrid classification approach with the objective of minimizing the impact of the parameter K on classification accuracy. In the training stage, the SVM is used to reduce the training samples of each available category to their support vectors (SVs). The SVs of the different categories then serve as the training data of the nearest neighbor classification algorithm, in which the Euclidean distance function is used to calculate the average distance between the testing data point and the set of SVs of each category. The testing data point is assigned to the category whose SVs have the shortest average distance to it. Experiments on several benchmark text datasets show that the classification accuracy of the SVM-NN approach is much less sensitive to the value of the parameter than that of the conventional KNN classification model.
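The decision rule described above can be illustrated with a minimal Python sketch. It covers only the nearest neighbor stage and assumes the support vectors of each category have already been extracted by a separately trained SVM. The function names, the category labels, and the 2-D vectors are hypothetical and chosen for illustration; they are not taken from the paper, and real text data would use high-dimensional term-weight vectors.

```python
import math


def average_distance(point, vectors):
    """Mean Euclidean distance from `point` to every vector in `vectors`.

    Assumes `vectors` is non-empty (each category has at least one SV).
    """
    return sum(math.dist(point, v) for v in vectors) / len(vectors)


def classify_by_support_vectors(point, support_vectors_by_category):
    """Return the category whose support vectors are closest on average."""
    distances = {
        category: average_distance(point, svs)
        for category, svs in support_vectors_by_category.items()
    }
    # The category with the shortest average distance wins.
    return min(distances, key=distances.get)


# Hypothetical support vectors, standing in for the output of the SVM stage.
support_vectors = {
    "sports": [(0.0, 0.0), (0.0, 2.0)],
    "politics": [(4.0, 0.0), (4.0, 2.0)],
}

label = classify_by_support_vectors((1.0, 1.0), support_vectors)
print(label)
```

Because every support vector of a category contributes to the average, this sketch has no neighbor-count parameter to tune, which is in line with the motivation given in the abstract.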