Unified linear subspace approach to semantic analysis

Authors:
Dandan Li; Chung-Ping Kwong; Dik Lun Lee
Affiliations:
Department of Computer Science and Engineering, Hong Kong University of Science & Technology, Hong Kong; Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, Hong Kong; Department of Computer Science and Engineering, Hong Kong University of Science & Technology, Hong Kong
Venue:
Journal of the American Society for Information Science and Technology
Year:
2010



Abstract

The Basic Vector Space Model (BVSM) is well known in information retrieval. Unfortunately, its retrieval effectiveness is limited because it relies on literal term matching. The Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) are two prominent semantic retrieval methods, both of which assume that a dataset has an underlying latent semantic structure that can be used to improve retrieval performance. However, while this structure may be derived from both the term space and the document space, GVSM exploits only the former and LSI only the latter. In this article, the latent semantic structure of a dataset is examined from a dual perspective; namely, we consider the term space and the document space simultaneously. This new viewpoint has a natural connection to the notion of kernels. Specifically, a unified kernel function can be derived for a class of vector space models. The dual perspective provides a deeper understanding of the semantic space and makes transparent the geometrical meaning of the unified kernel function. New semantic analysis methods based on the unified kernel function are developed, which combine the advantages of LSI and GVSM. We also prove that the new methods are stable: even when the selected rank of the truncated Singular Value Decomposition (SVD) is far from the optimum, retrieval performance is not degraded significantly. Experiments performed on standard test collections show that our methods are promising. © 2010 Wiley Periodicals, Inc.
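
To make the kernel view in the abstract more concrete, the following is a minimal sketch of how the three baseline models can each be written as a similarity function between a query and a document. It uses the commonly cited textbook forms (inner product for BVSM, inner product after mapping through the term-by-document matrix for GVSM, inner product after projection onto the leading left singular vectors for LSI). It does not reproduce the unified kernel function derived in the article, which the abstract does not specify. The function names, the toy matrix `A`, and the term labels are all hypothetical and chosen only for illustration.

```python
import numpy as np

def bvsm_kernel(q, d):
    """Literal term matching: plain inner product of term vectors."""
    return float(q @ d)

def gvsm_kernel(q, d, A):
    """Textbook GVSM form: compare vectors after mapping through A^T,
    where A is a term-by-document matrix (assumed layout)."""
    return float((A.T @ q) @ (A.T @ d))

def lsi_kernel(q, d, A, k):
    """Textbook LSI form: project onto the top-k left singular vectors
    of A (rank-k truncated SVD) and take the inner product there."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]
    return float((Uk.T @ q) @ (Uk.T @ d))

# Hypothetical toy collection: rows are terms, columns are documents.
A = np.array([
    [1.0, 0.0, 0.0],  # "car"
    [0.0, 1.0, 0.0],  # "automobile"
    [1.0, 1.0, 0.0],  # "engine"
    [0.0, 0.0, 1.0],  # "flower"
])

q = np.array([1.0, 0.0, 0.0, 0.0])  # query containing only "car"
d = A[:, 1]                         # document with "automobile" and "engine"

literal = bvsm_kernel(q, d)
gvsm = gvsm_kernel(q, d, A)
lsi = lsi_kernel(q, d, A, k=1)

print("BVSM:", literal)
print("GVSM:", gvsm)
print("LSI (k=1):", lsi)
```

In this toy collection the query and the document share no term, so the literal BVSM score is zero, while the GVSM and LSI scores are positive because "car" and "automobile" both co-occur with "engine". This is only meant to illustrate the limitation of literal matching described above.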