Latent semantic indexing (LSI) is used in a wide variety of commercial applications. In these applications, the processing time and RAM required for the singular value decomposition (SVD) computation, as well as the processing time and RAM required during LSI retrieval operations, are all roughly linear in the number of dimensions, k, chosen for the LSI representation space. In large-scale commercial LSI applications, reducing k could therefore significantly reduce server costs. This paper explores the effects of varying dimensionality. The approach focuses on term comparisons: pairs of terms with strong real-world associations are considered, and the proximities of the members of each pair in the LSI space are compared at multiple values of k. Testing is carried out on collections ranging from one to five million documents. For the five-million-document collection, a value of k ≈ 400 provides the best performance. The results suggest that there is something of an 'island of stability' in the k = 300 to 500 range. They also indicate that there is relatively little room to employ k values outside this range without incurring significant distortions in at least some term-term correlations.
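The term-pair comparison described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy term-document matrix, the term list, the function names, and the choice to scale term vectors by the singular values are all assumptions made for illustration, and a dense NumPy SVD stands in for whatever large-scale SVD a production system would use.

```python
import numpy as np

# Hypothetical toy data: rows are terms, columns are documents (raw counts).
TERMS = ["car", "automobile", "engine", "flower", "petal"]
TERM_DOC = np.array([
    [2, 0, 1, 0, 0, 0],  # car
    [0, 2, 1, 0, 0, 0],  # automobile
    [1, 1, 2, 0, 0, 0],  # engine
    [0, 0, 0, 2, 1, 0],  # flower
    [0, 0, 0, 1, 2, 1],  # petal
], dtype=float)


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is (numerically) zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 1e-12 else 0.0


def pair_similarity_by_k(term_doc, terms, pairs, k_values):
    """Return {k: {(term_a, term_b): cosine}} for each requested dimensionality k.

    Term vectors are taken as the rows of U_k * S_k from the truncated SVD
    (an assumed convention; other scalings are also used in practice).
    """
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    index = {term: i for i, term in enumerate(terms)}
    results = {}
    for k in k_values:
        kk = min(k, len(s))            # cannot keep more dimensions than exist
        vectors = u[:, :kk] * s[:kk]   # k-dimensional term vectors
        results[k] = {
            (a, b): cosine(vectors[index[a]], vectors[index[b]])
            for a, b in pairs
        }
    return results


if __name__ == "__main__":
    pairs = [("car", "automobile"), ("flower", "petal"), ("car", "flower")]
    for k, sims in pair_similarity_by_k(TERM_DOC, TERMS, pairs, [1, 2, 3, 5]).items():
        print(k, {pair: round(value, 3) for pair, value in sims.items()})
```

Tracking how each pair's cosine changes as k varies is the kind of comparison the study performs, although on collections of millions of documents and with k in the hundreds rather than on a five-term toy matrix.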