Random Indexing is a vector space technique that provides an efficient and scalable approximation to distributional similarity problems. We present experiments showing that Random Indexing handles large volumes of data poorly, and we evaluate weighting functions as a way to improve its performance. We find that Random Indexing is robust on small data sets, but its performance degrades on large data sets because of the influence of high-frequency attributes. Appropriate weight functions significantly reduce this degradation.
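For readers unfamiliar with the technique, the following is a minimal sketch of Random Indexing with an optional attribute weight. It is an illustration under stated assumptions, not the implementation evaluated in the experiments: the function names, the dimensionality, the number of non-zero entries, and in particular the log-based weight are all hypothetical choices made for this example. Each attribute receives a sparse random index vector, and each term's context vector is the weighted sum of the index vectors of the attributes it occurs with.

```python
import math
import random
from collections import Counter, defaultdict


def index_vector(dim, nonzero, rng):
    """Sparse ternary index vector: `nonzero` random positions set to +1 or -1."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def build_context_vectors(pairs, dim=64, nonzero=4, weighted=True, seed=0):
    """Build term context vectors from (term, attribute) observations.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    pairs = list(pairs)
    attr_freq = Counter(attr for _, attr in pairs)
    total = len(pairs)

    # One fixed random index vector per attribute (sorted for reproducibility).
    index = {attr: index_vector(dim, nonzero, rng) for attr in sorted(attr_freq)}

    context = defaultdict(lambda: [0.0] * dim)
    for term, attr in pairs:
        # Hypothetical weight: damp attributes that account for a large
        # share of all observations. Unweighted mode adds the raw vector.
        w = math.log(total / attr_freq[attr]) if weighted else 1.0
        vec = context[term]
        for i, x in enumerate(index[attr]):
            if x:
                vec[i] += w * x
    return dict(context)


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy observations: "the" is the high-frequency attribute.
observations = [
    ("cat", "purr"), ("cat", "the"),
    ("dog", "bark"), ("dog", "the"),
    ("kitten", "purr"), ("kitten", "the"),
]
vectors = build_context_vectors(observations)
similarity = cosine(vectors["cat"], vectors["kitten"])
```

In this sketch the frequent attribute "the" contributes less to each context vector than the rarer attributes "purr" and "bark", which is the kind of damping a weight function is meant to provide. Setting `weighted=False` gives the plain additive variant for comparison.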