Technical terms in text often appear as noun compounds, a frequently occurring yet highly ambiguous construction whose interpretation relies on extra-syntactic information. Several statistical methods for disambiguating compounds have been reported in the literature, often with quite impressive results. However, a striking feature of all these approaches is that they rely on the existence of previously seen unambiguous compounds, meaning they are prone to the problem of sparse data. This difficulty has been overcome somewhat through the use of hand-crafted knowledge resources to collect statistics on "concepts" rather than noun tokens, but domain-independence has been sacrificed by doing so. We report here on work investigating the application of Latent Semantic Indexing to provide a robust domain-independent source of the extra-syntactic knowledge necessary for noun compound disambiguation.
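The abstract does not say how Latent Semantic Indexing scores are turned into a bracketing decision, so the following is only a minimal sketch under stated assumptions. It builds a term-by-document count matrix from a toy corpus, reduces it with a truncated SVD, and brackets a three-noun compound by comparing cosine similarities of adjacent nouns (an adjacency-style comparison chosen here for illustration). The function names, the toy corpus, and the choice of k are all hypothetical and are not taken from the paper.

```python
import numpy as np


def build_term_doc_matrix(docs):
    """Return a term-by-document count matrix and a term -> row index map."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            matrix[index[word], j] += 1.0
    return matrix, index


def lsi_term_vectors(matrix, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def bracket(compound, vectors, index):
    """Bracket a three-noun compound (n1, n2, n3) as 'left' or 'right'.

    Assumption for illustration: prefer [[n1 n2] n3] when n1 and n2 are
    at least as similar in the latent space as n2 and n3.
    """
    n1, n2, n3 = (vectors[index[word]] for word in compound)
    left_score = cosine(n1, n2)
    right_score = cosine(n2, n3)
    return "left" if left_score >= right_score else "right"


# Hypothetical toy corpus; a real experiment would use a large text collection.
docs = [
    "computer science lecture",
    "computer science lecture",
    "department store sale",
    "department store sale",
]
matrix, index = build_term_doc_matrix(docs)
vectors = lsi_term_vectors(matrix, k=2)
print(bracket(("computer", "science", "department"), vectors, index))
```

Because similarity is computed between term vectors rather than looked up from previously seen compounds, a sketch like this needs no hand-crafted concept hierarchy, which is the property the abstract is after. Whether such similarities are accurate enough in practice depends on the corpus and on the number of retained dimensions.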