This paper presents a method for improving the accuracy of subcategorization frames (SCFs) acquired from corpora in order to augment existing lexical resources. I estimate a confidence value for each SCF using corpus-based statistics, and then cluster the SCF confidence-value vectors of words to capture co-occurrence tendencies among SCFs in the lexicon. I apply the method to SCFs acquired from corpora using the lexicons of two large-scale lexicalized grammars. The resulting SCFs achieve higher precision and recall than SCFs obtained by a naive frequency cut-off.
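The two steps described above (per-SCF confidence estimation, then clustering of per-word confidence vectors) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the confidence formula `count / (count + k)`, the plain k-means procedure, the frame labels in `SCFS`, the toy counts, and the 0.5 threshold are all assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List, Tuple

# Hypothetical SCF inventory; the real inventory depends on the grammar used.
SCFS = ["NP", "NP_PP", "S", "NP_NP"]

def confidence_vector(counts: Dict[str, int], k: float = 2.0) -> List[float]:
    """Map raw SCF counts for one word to values in [0, 1).

    Assumed stand-in for the corpus-based confidence estimate:
    count / (count + k), which grows with the amount of evidence.
    """
    return [counts.get(s, 0) / (counts.get(s, 0) + k) for s in SCFS]

def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors: List[List[float]], n_clusters: int,
           iters: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means, seeded deterministically with the first n_clusters vectors."""
    centroids = [list(v) for v in vectors[:n_clusters]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid for every word vector.
        assign = [min(range(n_clusters), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

def filter_scfs(counts_by_word: Dict[str, Dict[str, int]],
                n_clusters: int = 2,
                threshold: float = 0.5) -> Dict[str, List[str]]:
    """Accept, for each word, the SCFs whose cluster-centroid confidence
    reaches the (assumed) threshold."""
    words = sorted(counts_by_word)
    vectors = [confidence_vector(counts_by_word[w]) for w in words]
    assign, centroids = kmeans(vectors, n_clusters)
    return {w: [s for s, c in zip(SCFS, centroids[a]) if c >= threshold]
            for w, a in zip(words, assign)}

# Toy counts, invented for illustration only.
corpus_counts = {
    "give":  {"NP": 30, "NP_NP": 20, "NP_PP": 25},
    "send":  {"NP": 12, "NP_NP": 1, "NP_PP": 10},
    "say":   {"S": 40, "NP": 8},
    "think": {"S": 25, "NP": 1},
}
accepted = filter_scfs(corpus_counts)
```

In this toy data, "send" occurs with the `NP_NP` frame only once, so a naive frequency cut-off would likely discard it. Because "send" falls in the same cluster as "give", where `NP_NP` is well attested, the centroid-based filter keeps the frame. This is the kind of co-occurrence tendency among SCFs that the clustering step is meant to exploit.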