Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond
Support Vector Machines for Text Categorization
HICSS '03 Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 4 - Volume 4
Text classification using string kernels
The Journal of Machine Learning Research
Kernel Methods for Pattern Analysis
Extracting word sequence correspondences with support vector machines
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Text document clustering based on frequent word sequences
Proceedings of the 14th ACM international conference on Information and knowledge management
The traditional bag-of-words model and the more recent word-sequence kernel are two well-known techniques in text categorization. The bag-of-words representation ignores word order, which can reduce classification accuracy for some types of documents. The word-sequence kernel takes word order into account but does not capture all of the word-frequency information. The authors previously proposed a weighted kernel model that combines the two [1]. This paper focuses on optimizing the weighting parameters, which are functions of word frequency. Experiments on the Reuters database show that the new weighted kernel achieves better classification accuracy.
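
To make the idea of combining the two representations concrete, the following is a minimal sketch and not the authors' formulation. It assumes a single fixed scalar weight `lam` (the paper's weights are functions of word frequency, which are not given here), and it stands in for the word-sequence kernel with a simple count of shared contiguous word bigrams. The names `bow_kernel`, `ngram_kernel`, `normalized`, and `weighted_kernel` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def bow_kernel(a, b):
    """Bag-of-words kernel: dot product of raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    return float(sum(ca[w] * cb[w] for w in ca.keys() & cb.keys()))


def ngram_kernel(a, b, n=2):
    """Simplified order-aware kernel: dot product of contiguous word n-gram counts.

    Assumption: this is a stand-in for a word-sequence kernel, not the
    gap-weighted subsequence kernel used in the literature.
    """
    ga = Counter(tuple(a[i:i + n]) for i in range(len(a) - n + 1))
    gb = Counter(tuple(b[i:i + n]) for i in range(len(b) - n + 1))
    return float(sum(ga[g] * gb[g] for g in ga.keys() & gb.keys()))


def normalized(kernel, a, b):
    """Cosine-style normalization so both kernels lie on a comparable scale."""
    denom = sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom > 0 else 0.0


def weighted_kernel(a, b, lam=0.5):
    """Convex combination of the two normalized kernels.

    Assumption: lam is a fixed scalar in [0, 1]; a frequency-dependent
    weighting would replace it.
    """
    return lam * normalized(bow_kernel, a, b) + (1.0 - lam) * normalized(ngram_kernel, a, b)


# Two token lists with identical word counts but different word order.
doc1 = "the bank raised interest rates".split()
doc2 = "interest rates raised the bank".split()

print(weighted_kernel(doc1, doc2, lam=1.0))  # bag-of-words only
print(weighted_kernel(doc1, doc2, lam=0.0))  # bigram kernel only
print(weighted_kernel(doc1, doc2, lam=0.5))  # equal mix
```

With `lam=1.0` the two documents look identical because only word counts are compared; with `lam=0.0` the similarity drops because only two of the four bigrams are shared. Since a non-negative weighted sum of valid kernels is itself a valid kernel, a function of this shape could in principle be supplied to a kernel-based classifier such as a support vector machine.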