A new approach is proposed for clustering the words in a given vocabulary. The method is based on latent semantic analysis, a paradigm first formulated in the context of information retrieval. This paradigm leads to a parsimonious vector representation of each word in a suitable vector space, where familiar clustering techniques can be applied. The distance measure selected in this space arises naturally from the problem formulation. Preliminary experiments indicate that the clusters produced are intuitively satisfactory. Because these clusters are semantic in nature, this approach may prove useful as a complement to conventional class-based statistical language modeling techniques.
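The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions the abstract does not state: raw word-by-document counts with no weighting, word vectors taken as the left singular vectors scaled by the singular values, cosine similarity as the closeness measure, and a small deterministic k-means as the "familiar clustering technique". The function names `lsa_word_vectors` and `cluster_words` and the toy corpus `DOCS` are hypothetical.

```python
import numpy as np


def lsa_word_vectors(docs, rank):
    """Return (vocab, vectors) with one rank-dimensional vector per word."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Word-by-document count matrix (assumption: raw counts, no weighting).
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0
    # Truncated SVD: keep the `rank` largest singular triplets.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    # Assumption: represent word i by row i of U scaled by the singular values.
    return vocab, u[:, :rank] * s[:rank]


def cluster_words(vectors, k, iters=20):
    """Cluster rows of `vectors` by cosine similarity with a simple k-means."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)
    # Deterministic farthest-first initialisation, starting from the first word.
    centers = [unit[0]]
    while len(centers) < k:
        best_sim = np.max(unit @ np.array(centers).T, axis=1)
        centers.append(unit[np.argmin(best_sim)])
    centers = np.array(centers)
    labels = np.zeros(len(unit), dtype=int)
    for _ in range(iters):
        # Assign each word to the center with the highest cosine similarity.
        labels = np.argmax(unit @ centers.T, axis=1)
        for c in range(k):
            members = unit[labels == c]
            if len(members):
                mean = members.mean(axis=0)
                centers[c] = mean / max(np.linalg.norm(mean), 1e-12)
    return labels


# Hypothetical toy corpus: three finance documents and three sports documents.
DOCS = [
    "stock market bank stock",
    "bank market trade",
    "stock trade bank market",
    "goal team match goal",
    "team match player player",
    "player goal team match",
]

vocab, vectors = lsa_word_vectors(DOCS, rank=2)
labels = cluster_words(vectors, k=2)
clusters = {}
for word, label in zip(vocab, labels):
    clusters.setdefault(int(label), []).append(word)
print(clusters)
```

On this toy corpus the two topics share no vocabulary, so the rank-2 space places the finance words and the sports words along two different directions and the clustering separates them. Real vocabularies overlap across documents, and the rank, the weighting of the counts, and the number of clusters would all need tuning.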