Text classification: A least square support vector machine approach

Authors:
Vikramjit Mitra;Chia-Jiu Wang;Satarupa Banerjee
Affiliations:
ECE Department, University of Maryland, College Park, MD, United States; ECE Department, University of Colorado Colorado Springs, CO, United States; CS Department, Villanova University, Villanova, PA, United States
Venue:
Applied Soft Computing
Year:
2007


Abstract

This paper presents a least square support vector machine (LS-SVM) that classifies noisy document titles into predetermined categories. The system's potential is demonstrated on a corpus of 91,229 words from the University of Denver's Penrose Library catalogue, where the classification accuracy of the proposed LS-SVM based system is found to be over 99.9%. The final classifier is an LS-SVM array with a Gaussian radial basis function (GRBF) kernel, which classifies the titles using the coefficients generated by the latent semantic indexing (LSI) algorithm. These coefficients are also used to generate the confidence factors for the inference engine that presents the final decision of the entire classifier. The system is also compared with K-nearest neighbor (KNN) and Naive Bayes (NB) classifiers, and the comparison shows that the proposed LS-SVM based architecture outperforms both. A further comparison with conventional linear SVM based classifiers and neural network based classifying agents shows that the LS-SVM with LSI based classifying agents improves text categorization performance significantly and holds considerable potential for developing robust learning based agents for text classification.
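
The pipeline the abstract describes (latent semantic indexing coefficients fed to an LS-SVM with a Gaussian RBF kernel) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term-document matrix, the two-category setup, the parameter values `gamma` and `sigma`, and all function names are assumptions made for illustration. It uses the standard LS-SVM classifier formulation, in which training reduces to solving one linear system, and it omits the paper's classifier array, confidence factors and inference engine.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Return the top-k left singular vectors of a terms x documents matrix."""
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma, sigma):
    """Solve the LS-SVM linear system for the bias b and multipliers alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma  # ridge term from the squared-error loss
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_score(X_train, y, alpha, b, X_new, sigma):
    """Decision values; the sign gives the predicted category."""
    return rbf_kernel(X_new, X_train, sigma) @ (alpha * y) + b

# Hypothetical toy data: 6 terms (rows) x 6 titles (columns), two categories.
term_doc = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 0, 1, 2],
], dtype=float)
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

U_k = lsi_fit(term_doc, k=2)
X = term_doc.T @ U_k  # one row of LSI coefficients per title

sigma, gamma = 1.0, 10.0  # assumed values, not taken from the paper
b, alpha = lssvm_train(X, y, gamma, sigma)
pred_train = np.sign(lssvm_score(X, y, alpha, b, X, sigma))

# Two unseen titles, expressed as term counts and projected with the same U_k.
new_titles = np.array([[1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]], dtype=float)
pred_new = np.sign(lssvm_score(X, y, alpha, b, new_titles @ U_k, sigma))
print(pred_train, pred_new)
```

For more than two categories, one common arrangement is to train one such binary classifier per category and combine their decision values; whether the paper's "LS-SVM array" works this way is not stated in the abstract.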