Hypothesizing word association from untagged text

Authors:
Tomoyoshi Matsukawa
Affiliations:
BBN Systems and Technologies, Cambridge, MA
Venue:
HLT '93 Proceedings of the workshop on Human Language Technology
Year:
1993

Abstract

This paper reports a new method for suggesting word associations. It is based on a greedy algorithm that applies chi-square statistics to the joint frequencies of pairs of word groups, comparing them against chance co-occurrence. The new approach has two benefits: 1) even low-frequency words and word pairs can be considered, and 2) word groups and word associations can be generated automatically. The method achieved 87% accuracy in hypothesizing word associations for unobserved combinations of words in Japanese text.
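
As an illustration of the kind of statistic the abstract refers to, the following is a minimal sketch of a Pearson chi-square score for a 2x2 co-occurrence table, comparing an observed joint frequency with the frequency expected by chance. It is not the paper's greedy grouping algorithm, which the abstract does not spell out. The function name chi_square_2x2 and its parameters are assumptions made for this example.

```python
def chi_square_2x2(joint, freq_a, freq_b, total):
    """Pearson chi-square for a 2x2 co-occurrence table (illustrative sketch).

    joint  : number of observed pairs containing both group A and group B
    freq_a : number of observed pairs containing group A
    freq_b : number of observed pairs containing group B
    total  : total number of observed pairs
    """
    # Observed counts: rows are A present / A absent,
    # columns are B present / B absent.
    observed = [
        [joint, freq_a - joint],
        [freq_b - joint, total - freq_a - freq_b + joint],
    ]
    row_totals = [freq_a, total - freq_a]
    col_totals = [freq_b, total - freq_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if A and B co-occurred only by chance.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2
```

A score near zero means the two groups co-occur about as often as chance predicts, while a large score signals a departure from chance. The statistic alone does not distinguish positive from negative association, so a caller would also compare the observed joint count with its expected value.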