Automated text content identification for document processing using a kernel-based support vector selection approach

Authors:
Steven M. Benveniste;Monique P. Fargues
Affiliations:
ECE Department, Naval Postgraduate School, Monterey, CA;ECE Department, Naval Postgraduate School, Monterey, CA
Venue:
Asilomar'09 Proceedings of the 43rd Asilomar conference on Signals, systems and computers
Year:
2009

Abstract

Automated text analysis and mining tools designed to identify the main topics of texts, chat room discussions, and web postings are an increasingly active research area due to the rapid growth of Web information. This paper applies the nonlinear kernel-based Feature Vector Selection (FVS) approach, followed by a Linear Discriminant Analysis (LDA) step, to categorize unstructured text documents. Results are compared to those obtained using the Latent Semantic Analysis (LSA) approach commonly used in text categorization applications. Overall results, taking into account both classification performance and computational load, show that FVS-LDA implemented with a polynomial kernel of degree 1 and an added constant of 1 is the best classifier for the database considered.
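
For illustration only, the kernel named in the abstract can be sketched as below, assuming the common polynomial form k(x, y) = (x . y + c)^d with d = 1 and c = 1. The function names and the toy term-count vectors are hypothetical and are not taken from the paper. The sketch covers only the kernel evaluation, not the FVS selection step or the LDA stage.

```python
from typing import List, Sequence


def polynomial_kernel(x: Sequence[float], y: Sequence[float],
                      degree: int = 1, constant: float = 1.0) -> float:
    """Assumed polynomial kernel form: (x . y + constant) ** degree."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + constant) ** degree


def gram_matrix(docs: List[Sequence[float]],
                degree: int = 1, constant: float = 1.0) -> List[List[float]]:
    """Pairwise kernel values between all document vectors."""
    return [[polynomial_kernel(a, b, degree, constant) for b in docs]
            for a in docs]


# Hypothetical term-count vectors for three short documents.
docs = [[2, 0, 1], [0, 3, 1], [1, 1, 0]]
K = gram_matrix(docs)  # degree 1, constant 1, as in the abstract
```

With degree 1, each kernel value is just the inner product of two document vectors plus the constant, so the Gram matrix costs no more to build than a matrix of plain dot products.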