Occurrence patterns of words in documents can be expressed as binary vectors. When two vectors are similar, the corresponding words may have an implicit relationship with each other. We call two such words a correlated pair. This report describes a method for obtaining the most highly correlated pairs of a given size. In practice, the method requires O(N log N) computation time and O(N) memory space, where N is the number of documents or records. Because these costs do not depend on the size of the vocabulary under analysis, it is possible to compute correlations between all the words in a corpus.
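To make the notion of a correlated pair concrete, the following is a minimal brute-force sketch, not the method the report describes. It stores each word's binary occurrence vector as the set of document indices in which the word appears, then ranks word pairs by similarity. The abstract does not name a similarity measure, so Jaccard similarity is an assumption here, and the function names (`occurrence_sets`, `jaccard`, `top_correlated_pairs`) are hypothetical. Unlike the report's method, this baseline compares every pair of words, so its cost grows quadratically with vocabulary size.

```python
from itertools import combinations


def occurrence_sets(docs):
    """Map each word to the set of document indices it occurs in.

    The set is a sparse form of the word's binary occurrence vector.
    """
    index = {}
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def jaccard(a, b):
    """Similarity of two occurrence sets (assumed measure, not from the report)."""
    return len(a & b) / len(a | b)


def top_correlated_pairs(docs, k):
    """Return the k most similar word pairs as (score, word1, word2) tuples.

    Brute force: every pair of words is scored, which is exactly the
    vocabulary-dependent cost the report's method is said to avoid.
    """
    index = occurrence_sets(docs)
    scored = [
        (jaccard(index[u], index[v]), u, v)
        for u, v in combinations(sorted(index), 2)
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored[:k]


docs = ["salt pepper", "salt pepper soup", "soup bread", "bread"]
best = top_correlated_pairs(docs, 1)
print(best)
```

In this toy corpus, "salt" and "pepper" occur in exactly the same documents, so they form the top-ranked pair.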