Regularized locality preserving indexing via spectral regression

Authors:
Deng Cai;Xiaofei He;Wei Vivian Zhang;Jiawei Han
Affiliations:
University of Illinois at Urbana Champaign, Urbana, IL;Yahoo! Inc., Burbank, CA;Yahoo! Inc., Burbank, CA;University of Illinois at Urbana Champaign, Urbana, IL
Venue:
Proceedings of the sixteenth ACM Conference on Information and Knowledge Management
Year:
2007


Abstract

We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Unlike Latent Semantic Indexing (LSI), which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is inefficient in both time and memory, which makes it difficult to apply to very large data sets. Specifically, the computation of LPI involves eigen-decompositions of two dense matrices, which is expensive. In this paper, we propose a new algorithm called Regularized Locality Preserving Indexing (RLPI). Building on recent progress in spectral graph analysis, we cast the original LPI algorithm into a regression framework, which enables us to avoid eigen-decomposition of dense matrices. In addition, the regression-based framework allows different kinds of regularizers to be incorporated naturally, which makes the algorithm more flexible. Extensive experimental results show that RLPI obtains similar or better results compared to LPI and is significantly faster, which makes it an efficient and effective data preprocessing method for large-scale text clustering, classification, and retrieval.
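The two-stage idea described in the abstract (spectral graph analysis followed by regularized regression) can be illustrated with a rough sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function name `spectral_regression_embedding`, the cosine-similarity k-nearest-neighbor graph, the ridge (L2) regularizer, and the parameters `n_neighbors` and `alpha`. Dense NumPy arrays are used only for brevity; at scale, the graph matrix would be stored sparsely and both steps would use sparse or iterative solvers.

```python
import numpy as np
from scipy.linalg import eigh


def spectral_regression_embedding(X, n_components=2, n_neighbors=5, alpha=0.1):
    """Hypothetical sketch: learn a linear map from documents to a low-dim space.

    X is an (n_docs, n_terms) array of non-negative term weights.
    Returns A with shape (n_terms, n_components); embed documents with X @ A.
    """
    n, m = X.shape

    # Assumed graph construction: cosine similarity, k nearest neighbors.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)  # no self-loops

    idx = np.argsort(-S, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    W = np.zeros_like(S)
    W[rows, idx.ravel()] = S[rows, idx.ravel()]
    W = np.maximum(W, W.T)  # symmetrize the neighbor graph

    # Step 1 (spectral): solve W y = lambda * D y on the graph matrices.
    # This assumes every document has at least one neighbor with positive
    # similarity, so that the degree matrix D is positive definite.
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(W, D)  # eigenvalues are returned in ascending order
    # Take the top eigenvectors, skipping the trivial constant one
    # (this assumes the graph is connected).
    Y = vecs[:, ::-1][:, 1:n_components + 1]

    # Step 2 (regression): ridge regression of each eigenvector on X,
    # i.e. minimize ||X a - y||^2 + alpha * ||a||^2 for each column y of Y.
    A = np.linalg.solve(X.T @ X + alpha * np.eye(m), X.T @ Y)
    return A
```

In this sketch the only eigenproblem involves the neighbor graph, which is sparse in practice, and the document-term matrix enters only through a regularized least-squares problem. Swapping the ridge penalty for another regularizer would change only the second step.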