An increasing number of computational and statistical approaches have been used for text classification, including nearest-neighbor classification, naive Bayes classification, support vector machines, decision tree induction, rule induction, and artificial neural networks. Among these approaches, naive Bayes classifiers have been widely used because of their simplicity. Owing to the simplicity of the Bayes formula, the naive Bayes classification algorithm requires a relatively small amount of training data and less time in both the training and classification stages than other classifiers. However, a major shortcoming of this technique is that the classifier simply assigns the document to the category with the highest probability. Doing so is tantamount to classifying using only one dimension of a multi-dimensional data set. The main aim of this work is to use the strengths of the self-organizing map (SOM) to overcome the inadvertent dimensionality reduction that results from using only the Bayes formula to classify. Combining the hybrid system with new ranking techniques further improves the performance of the proposed document classification approach. This work describes the implementation of an enhanced hybrid classification approach that affords better classification accuracy by combining two familiar algorithms: the naive Bayes classification algorithm, which is used to vectorize the document using a probability distribution, and the SOM clustering algorithm, which is used as the multi-dimensional unsupervised classifier.
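
The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the smoothing scheme, the SOM topology, the way map nodes receive labels, or the ranking techniques, so the Laplace smoothing, the 1-D four-node map, the majority-vote node labeling, the toy documents, and all function names (`train_nb`, `nb_vector`, `train_som`, `label_nodes`, `classify`) are assumptions made purely for illustration. The ranking step is omitted.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class word counts and priors (multinomial naive Bayes)."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.split()}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d.split())
    priors = {c: labels.count(c) / len(labels) for c in classes}
    return classes, vocab, counts, priors


def nb_vector(model, doc):
    """Return the whole posterior vector P(class | doc), not just its argmax."""
    classes, vocab, counts, priors = model
    log_scores = []
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)  # Laplace smoothing (assumed)
        lp = math.log(priors[c])
        for w in doc.split():
            if w in vocab:  # out-of-vocabulary words are ignored
                lp += math.log((counts[c][w] + 1) / total)
        log_scores.append(lp)
    top = max(log_scores)
    exps = [math.exp(s - top) for s in log_scores]  # numerically stable normalization
    norm = sum(exps)
    return [e / norm for e in exps]


def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bmu(nodes, v):
    """Index of the best-matching unit (closest SOM node) for vector v."""
    return min(range(len(nodes)), key=lambda i: dist2(nodes[i], v))


def train_som(vectors, n_nodes=4, epochs=50, seed=0):
    """Train a small 1-D SOM (assumed topology) on the posterior vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1 - epoch / epochs
        lr = 0.5 * frac                          # decaying learning rate
        radius = max(frac * n_nodes / 2, 0.5)    # shrinking neighborhood
        for v in vectors:
            b = bmu(nodes, v)
            for i, node in enumerate(nodes):
                h = math.exp(-((i - b) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    node[k] += lr * h * (v[k] - node[k])
    return nodes


def label_nodes(nodes, vectors, labels):
    """Label each node by majority vote of the training vectors it wins."""
    votes = defaultdict(Counter)
    for v, y in zip(vectors, labels):
        votes[bmu(nodes, v)][y] += 1
    return {i: c.most_common(1)[0][0] for i, c in votes.items()}


def classify(model, nodes, node_labels, doc):
    """Vectorize with naive Bayes, then answer with the nearest labeled node."""
    v = nb_vector(model, doc)
    best = min(sorted(node_labels), key=lambda i: dist2(nodes[i], v))
    return node_labels[best]


# Toy data, invented for this sketch only.
docs = ["cheap pills buy now", "buy cheap watches now",
        "meeting agenda for monday", "project meeting notes monday"]
labels = ["spam", "spam", "ham", "ham"]

model = train_nb(docs, labels)
vectors = [nb_vector(model, d) for d in docs]
nodes = train_som(vectors)
node_labels = label_nodes(nodes, vectors, labels)

print(nb_vector(model, "buy cheap pills"))   # posterior over ["ham", "spam"]
print(classify(model, nodes, node_labels, "buy cheap pills"))
```

The point of the sketch is the hand-off between the stages: `nb_vector` keeps every class probability rather than collapsing to the single most probable category, and the SOM then operates on that full vector, which is the dimensionality the abstract argues a plain naive Bayes decision discards.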