Two novel word clustering techniques that employ long distance bigram language models are proposed. The first is built on a hierarchical clustering algorithm and, at each cluster merger, minimizes the sum of the Mahalanobis distances of all words from the centroid of the class created by the merger. The second resorts to probabilistic latent semantic analysis (PLSA). Interpolated long distance bigrams are then considered in the context of both clustering techniques. Experiments conducted on the English Gigaword corpus (second edition) demonstrate that: (1) long distance bigrams, when employed in the two clustering techniques under study, yield word clusters of better quality than the baseline bigrams; (2) interpolated long distance bigrams outperform long distance bigrams in the same respect; (3) long distance bigrams perform better than bigrams that incorporate trigger pairs selected at various distances; and (4) the best word clustering is achieved by PLSA employing interpolated long distance bigrams. Both proposed techniques outperform spectral clustering based on k-means. To assess the quality of the created clusters objectively, relative cluster validity indices are estimated, and the average cluster sense precision, the average cluster sense recall, and the F-measure are computed using ground truth extracted from WordNet.
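
As a rough illustration of the long distance bigrams and their interpolation mentioned above, the following Python sketch counts word pairs that occur d positions apart and mixes the resulting conditional probabilities over several distances. The function names, the maximum likelihood estimates, and the uniform interpolation weights are assumptions made for this example; the abstract does not state how the models are estimated or how the interpolation weights are chosen.

```python
from collections import Counter, defaultdict


def distance_bigram_counts(tokens, d):
    """Count pairs (tokens[i - d], tokens[i]), i.e. bigrams at distance d."""
    return Counter(zip(tokens, tokens[d:]))


def conditional_probs(pair_counts):
    """Maximum likelihood estimate of P_d(w | h) from distance-d pair counts."""
    history_totals = Counter()
    for (history, _), count in pair_counts.items():
        history_totals[history] += count
    return {
        (history, word): count / history_totals[history]
        for (history, word), count in pair_counts.items()
    }


def interpolated_probs(tokens, max_distance, weights=None):
    """Mix distance-1..max_distance bigram probabilities.

    Uniform weights are an assumption for illustration only.
    """
    if weights is None:
        weights = [1.0 / max_distance] * max_distance
    mixed = defaultdict(float)
    for d, weight in zip(range(1, max_distance + 1), weights):
        probs = conditional_probs(distance_bigram_counts(tokens, d))
        for pair, p in probs.items():
            mixed[pair] += weight * p
    return dict(mixed)
```

With `d = 1` the first function reduces to ordinary bigram counting, which corresponds to the baseline against which the long distance variants are compared.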