When people search for documents, they ultimately want content, not words. Search engines should therefore relate documents by their underlying concepts rather than by the words they contain. One promising technique for doing so is Latent Semantic Indexing (LSI). LSI dramatically reduces the dimension of the document space by mapping it into a space spanned by conceptual indices. Empirically, the number of concepts needed to represent the documents is far smaller than the number of distinct words in their text. Although this almost obviates the problem of lexical matching, the mapping incurs a high computational cost compared with document parsing, indexing, query matching, and updating. This article makes three contributions. First, it shows that the technique underlying LSI is just one example of a unitary operator, for which there are computationally more attractive alternatives. Second, it proposes the Haar transform as such an alternative, since it is memory efficient and can be computed in linear to sublinear time. Third, it generalizes LSI through a multiresolution representation of the document space. The approach not only preserves the advantages of LSI at drastically reduced computational cost, but also opens a spectrum of possibilities for new research.
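To make the unitary-operator idea concrete, the following is a minimal sketch of an orthonormal Haar transform applied to term-count vectors. It is an illustration only, not the article's implementation: the function names (`haar_transform`, `dot`, `coarse`), the example vectors, and the requirement that the vector length be a power of two are all assumptions introduced here. The sketch shows two properties the abstract relies on: the full transform preserves inner products (so query matching gives the same scores), and it runs in time linear in the vector length.

```python
import math


def haar_transform(vec):
    """Orthonormal Haar transform of a vector (length must be a power of two).

    Output layout: [overall average, coarsest detail, ..., finest details].
    Each pass halves the working length, so total work is n + n/2 + ... = O(n).
    """
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two (assumption of this sketch)")
    out = [float(x) for x in vec]
    root2 = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums (averages) and differences (details).
        avg = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:length] = avg + diff
        length = half
    return out


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def coarse(coeffs, k):
    """Keep only the first k (coarsest) coefficients: a lower-resolution view."""
    return coeffs[:k]


# Hypothetical term-count vectors over an 8-word vocabulary.
doc = [2, 0, 1, 1, 0, 0, 3, 1]
query = [1, 0, 0, 1, 0, 0, 1, 0]

doc_h = haar_transform(doc)
query_h = haar_transform(query)

# A unitary (here: real orthonormal) transform preserves inner products.
print(dot(doc, query), dot(doc_h, query_h))

# Truncating to the coarsest coefficients gives an approximate, cheaper match.
print(dot(coarse(doc_h, 4), coarse(query_h, 4)))
```

Truncating the coefficient vector with `coarse` is one simple way to picture a multiresolution representation: fewer coefficients give a coarser, cheaper comparison, while the full set reproduces the original inner product up to floating-point error.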