Latent Semantic Indexing (LSI) with Singular Value Decomposition (SVD) is an effective dimension reduction method for document classification and other information analysis tasks. The computational cost of SVD is a known bottleneck on large data sets, where faster dimension reduction with competitive accuracy is desirable. This paper presents Imprecise Spectrum Analysis (ISA), a fast dimension reduction method for document classification. ISA follows the one-sided Jacobi method for computing the SVD and simplifies its intensive orthogonality computation. It builds a representative matrix from the top-k column vectors derived from the original feature vector space and reduces the dimension of a feature vector by computing its product with this representative matrix. The paper analyzes the approximation error and explains the rationale behind this dimension reduction method. To further improve classification accuracy, it also presents a feature selection method for building the initial feature matrix and augments the representative matrix with centroid vectors. Extensive experimental results show that ISA handles large term-document feature matrices quickly while delivering classification accuracy better than or competitive with LSI with SVD on the tested benchmarks.
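For orientation, the baseline that ISA starts from can be sketched in a few lines. The code below is a minimal illustration of the standard one-sided (Hestenes) Jacobi method followed by projection onto the top-k right singular vectors, that is, the plain LSI-style reduction. It is not ISA itself: the abstract does not spell out how ISA simplifies the orthogonality computation, how it selects features, or how it adds centroid vectors, so none of that is reproduced here. The function names (`one_sided_jacobi`, `top_k_basis`, `reduce_vector`), the documents-as-rows layout, and the tolerance are assumptions made for this example.

```python
import math

def one_sided_jacobi(A, sweeps=30, tol=1e-12):
    """Standard Hestenes one-sided Jacobi (illustrative, not the paper's ISA).

    A is a list of rows (m x n). Column pairs are rotated until they are
    mutually orthogonal. Returns (B, V) with B = A V, where the column norms
    of B are the singular values and the columns of V are the right
    singular vectors.
    """
    m, n = len(A), len(A[0])
    B = [[float(x) for x in row] for row in A]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(B[i][p] ** 2 for i in range(m))
                beta = sum(B[i][q] ** 2 for i in range(m))
                gamma = sum(B[i][p] * B[i][q] for i in range(m))
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of columns p, q.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to the columns of B and V.
                for M in (B, V):
                    for row in M:
                        xp, xq = row[p], row[q]
                        row[p] = c * xp - s * xq
                        row[q] = s * xp + c * xq
        if not rotated:
            break
    return B, V

def top_k_basis(B, V, k):
    """Pick the k columns of V with the largest singular values (n x k)."""
    m, n = len(B), len(V)
    sigma = [math.sqrt(sum(B[i][j] ** 2 for i in range(m))) for j in range(n)]
    order = sorted(range(n), key=lambda j: sigma[j], reverse=True)[:k]
    R = [[V[i][j] for j in order] for i in range(n)]
    return R, [sigma[j] for j in order]

def reduce_vector(x, R):
    """Reduce a feature vector x (length n) to k dimensions via x R."""
    return [sum(x[i] * R[i][j] for i in range(len(x))) for j in range(len(R[0]))]

# Toy example: two documents (rows) over two terms (columns).
docs = [[3.0, 0.0], [4.0, 5.0]]
B, V = one_sided_jacobi(docs)
R, sigma = top_k_basis(B, V, k=1)
reduced_docs = [reduce_vector(d, R) for d in docs]
```

The sketch works on columns only, which is what makes the one-sided variant attractive for parallel and large sparse settings. The repeated inner products in the innermost loop are the orthogonality computation that the abstract says ISA simplifies.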