We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Unlike Latent Semantic Indexing (LSI), which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is inefficient in both time and memory, which makes it difficult to apply to very large data sets. Specifically, computing LPI involves eigen-decompositions of two dense matrices, which is expensive. In this paper, we propose a new algorithm called Regularized Locality Preserving Indexing (RLPI). Building on recent progress in spectral graph analysis, we cast the original LPI algorithm into a regression framework, which enables us to avoid eigen-decomposition of dense matrices. Moreover, within the regression-based framework, different kinds of regularizers can be naturally incorporated into our algorithm, which makes it more flexible. Extensive experimental results show that RLPI obtains similar or better results compared with LPI and is significantly faster, which makes it an efficient and effective data preprocessing method for large-scale text clustering, classification, and retrieval.
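The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a regression-based scheme of this kind might look; it is not the authors' implementation. It assumes a symmetric k-nearest-neighbor graph built from cosine similarity, takes the leading eigenvectors of the sparse normalized affinity matrix as regression targets, and fits each target with ridge (L2) regularized least squares via LSQR. The names `knn_graph` and `rlpi_sketch` and the parameters `k` and `alpha` are hypothetical, and the dense similarity matrix in `knn_graph` is used only to keep the toy example short.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh, lsqr


def knn_graph(X, k=5):
    """Symmetric 0/1 k-NN graph from cosine similarity (an assumed graph choice)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)  # assumes no all-zero rows
    S = Xn @ Xn.T                     # dense similarities; fine for a toy example
    np.fill_diagonal(S, -np.inf)      # exclude self-neighbors
    idx = np.argpartition(-S, k, axis=1)[:, :k]  # k most similar documents per row
    rows = np.repeat(np.arange(n), k)
    W = csr_matrix((np.ones(n * k), (rows, idx.ravel())), shape=(n, n))
    return W.maximum(W.T)             # symmetrize


def rlpi_sketch(X, n_components=2, k=5, alpha=0.1):
    """Return a (n_features, n_components) projection matrix.

    Step 1: eigenvectors of a sparse graph matrix (no dense eigen-decomposition).
    Step 2: ridge regression from documents to those eigenvectors.
    """
    X = np.asarray(X, dtype=float)
    W = knn_graph(X, k)
    d = np.asarray(W.sum(axis=1)).ravel()        # node degrees, all >= k
    d_inv_sqrt = diags(1.0 / np.sqrt(d))
    M = d_inv_sqrt @ W @ d_inv_sqrt              # sparse normalized affinity
    vals, vecs = eigsh(M, k=n_components + 1, which="LA")
    order = np.argsort(-vals)
    # Drop the leading (trivial) eigenvector; rescale to solve W y = lambda D y.
    Y = d_inv_sqrt @ vecs[:, order[1:]]
    # lsqr minimizes ||X a - y||^2 + damp^2 ||a||^2, so damp = sqrt(alpha).
    cols = [lsqr(X, Y[:, j], damp=np.sqrt(alpha))[0] for j in range(Y.shape[1])]
    return np.column_stack(cols)
```

Given a document-term matrix `X` with one row per document, `A = rlpi_sketch(X)` yields a projection and `X @ A` the low-dimensional document representation. In this sketch the only eigenproblem involves the sparse graph matrix, and the ridge penalty could be replaced by another regularizer, which mirrors the flexibility the abstract describes. Centering of `X` is omitted here for brevity.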