Disambiguating noun compounds with latent semantic indexing

Authors:
Alan M. Buckeridge; Richard F. E. Sutcliffe
Affiliations:
University of Limerick, Limerick, Ireland; University of Limerick, Limerick, Ireland
Venue:
COMPUTERM '02 COLING-02 on COMPUTERM 2002: second international workshop on computational terminology - Volume 14
Year:
2002

Abstract

Technical terms in text often appear as noun compounds, a frequent yet highly ambiguous construction whose interpretation relies on extra-syntactic information. Several statistical methods for disambiguating compounds have been reported in the literature, often with impressive results. However, all of these approaches rely on previously seen unambiguous compounds, which leaves them prone to the sparse data problem. This difficulty has been partly overcome by using hand-crafted knowledge resources to collect statistics on "concepts" rather than noun tokens, but at the cost of domain independence. We report here on work investigating the application of Latent Semantic Indexing to provide a robust, domain-independent source of the extra-syntactic knowledge necessary for noun compound disambiguation.
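
The abstract does not say how Latent Semantic Indexing is applied to a compound, so the following is only a minimal sketch under stated assumptions. It builds a small term-by-document count matrix, reduces it with a truncated SVD, and brackets a three-noun compound by comparing the cosine similarity of the first two nouns with that of the last two (an adjacency-style comparison chosen here for illustration, not necessarily the authors' procedure). The toy corpus, the function names `lsi_term_vectors`, `cosine`, and `bracket`, and the choice of `k=2` are all hypothetical.

```python
import numpy as np


def lsi_term_vectors(docs, k=2):
    """Return a dict mapping each term to its k-dimensional LSI vector."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Raw term-by-document counts (no weighting; an assumption of this sketch).
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0
    # Truncated SVD: keep only the k largest singular values.
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    vecs = U[:, :k] * s[:k]
    return {w: vecs[i] for w, i in index.items()}


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def bracket(compound, vectors):
    """Bracket (n1, n2, n3) as 'left' [[n1 n2] n3] or 'right' [n1 [n2 n3]]."""
    n1, n2, n3 = compound
    left = cosine(vectors[n1], vectors[n2])
    right = cosine(vectors[n2], vectors[n3])
    return "left" if left >= right else "right"


# Hypothetical toy corpus: each string stands in for one document.
docs = [
    "computer science",
    "computer science",
    "department store",
    "department",
]
vectors = lsi_term_vectors(docs, k=2)
print(bracket(("computer", "science", "department"), vectors))
```

Because the similarities come from co-occurrence in the reduced space rather than from counts of previously seen compounds, a sketch like this needs no hand-crafted lexical resource. A realistic setting would use a much larger corpus, a weighting scheme, and a far higher value of `k`.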