A novel word clustering algorithm based on latent semantic analysis

Authors:
J. R. Bellegarda;J. W. Butzberger;Yen-Lu Chow;N. B. Coccaro;D. Naik
Affiliations:
Interactive Media Group, Apple Comput. Inc., Cupertino, CA, USA;Interactive Media Group, Apple Comput. Inc., Cupertino, CA, USA;Interactive Media Group, Apple Comput. Inc., Cupertino, CA, USA;Interactive Media Group, Apple Comput. Inc., Cupertino, CA, USA;Interactive Media Group, Apple Comput. Inc., Cupertino, CA, USA
Venue:
ICASSP '96 Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
Year:
1996

Citing 0
Cited 15

Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis
ICADL '02 Proceedings of the 5th International Conference on Asian Digital Libraries: Digital Libraries: People, Knowledge, and Technology

Topic-based mixture language modelling
Natural Language Engineering

Text summarization using a trainable summarizer and latent semantic analysis
Information Processing and Management: an International Journal - Special issue: An Asian digital libraries perspective

Exploring asymmetric clustering for statistical language modeling
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Understanding without formality: augmenting speech recognition to understand informal verbal commands
Proceedings of the 43rd annual Southeast regional conference - Volume 1

Similarity based smoothing in language modeling
Acta Cybernetica

Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures
Expert Systems with Applications: An International Journal

Sequence prediction exploiting similarity information
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence

Topic-Dependent Language Model with Voting on Noun History
ACM Transactions on Asian Language Information Processing (TALIP)

IPSILON: incremental parsing for semantic indexing of latent concepts
IEEE Transactions on Image Processing

Constructing task-specific taxonomies for document collection browsing
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Resolving task specification and path inconsistency in taxonomy construction
Proceedings of the 3rd Workshop on the People's Web Meets NLP: Collaboratively Constructed Semantic Resources and their Applications to NLP

On the dynamic adaptation of language models based on dialogue information
Expert Systems with Applications: An International Journal

Semantic spaces for improving language modeling
Computer Speech and Language

Latent Semantic Analysis for Multimodal User Input With Speech and Gestures
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Abstract

A new approach is proposed for the clustering of words in a given vocabulary. The method is based on a paradigm first formulated in the context of information retrieval, called latent semantic analysis. This paradigm leads to a parsimonious vector representation of each word in a suitable vector space, where familiar clustering techniques can be applied. The distance measure selected in this space arises naturally from the problem formulation. Preliminary experiments indicate that the clusters produced are intuitively satisfactory. Because these clusters are semantic in nature, this approach may prove useful as a complement to conventional class-based statistical language modeling techniques.
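
The pipeline outlined in the abstract (word vectors from latent semantic analysis, then a standard clustering step) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the matrix weighting, the rank, the distance measure, or the clustering algorithm. The sketch assumes raw word-document counts, word vectors taken as left singular vectors scaled by the singular values, the angle between vectors as the distance, and a small k-means-style loop with a deterministic start. All function names and the toy corpus are hypothetical.

```python
import numpy as np


def build_word_document_matrix(documents):
    """Return (vocab, counts), where counts[i, j] is how often word i occurs in document j."""
    vocab = sorted({w for doc in documents for w in doc.split()})
    row = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for w in doc.split():
            counts[row[w], j] += 1.0
    return vocab, counts


def lsa_word_vectors(counts, rank):
    """Truncated SVD; each row is one word's vector (assumed scaling: U_k * S_k)."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cluster_words(vectors, k, iterations=20):
    """K-means-style clustering on unit vectors using angular distance (an assumption)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)

    # Deterministic farthest-first start: begin with the first word, then
    # repeatedly add the word least similar to every center chosen so far.
    centers = [unit[0]]
    while len(centers) < k:
        sims = np.max(unit @ np.array(centers).T, axis=1)
        centers.append(unit[int(np.argmin(sims))])
    centers = np.array(centers)

    for _ in range(iterations):
        # Assign each word to the center at the smallest angle.
        angles = np.arccos(np.clip(unit @ centers.T, -1.0, 1.0))
        labels = np.argmin(angles, axis=1)
        # Move each center to the renormalized mean of its members.
        for c in range(k):
            members = unit[labels == c]
            if len(members) > 0:
                mean = members.sum(axis=0)
                centers[c] = mean / max(np.linalg.norm(mean), 1e-12)
    return labels


# Toy corpus (hypothetical): two topics with disjoint vocabularies.
documents = [
    "stock market bank",
    "bank stock shares",
    "market shares stock",
    "dog cat pet",
    "cat pet dog",
    "pet dog cat",
]
vocab, counts = build_word_document_matrix(documents)
labels = cluster_words(lsa_word_vectors(counts, rank=2), k=2)
for word, label in zip(vocab, labels):
    print(word, label)
```

On a realistic vocabulary the count matrix would be large and sparse, so a sparse truncated SVD and some form of count weighting would likely be needed; the dense `np.linalg.svd` call here is only meant to keep the sketch self-contained.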