There are a number of collocational constraints in natural languages that ought to play a more important role in natural language parsers. Thus, for example, it is hard for most parsers to take advantage of the fact that wine is typically drunk, produced, and sold, but (probably) not pruned. So too, it is hard for a parser to know which verbs go with which prepositions (e.g., set up) and which nouns fit together to form compound noun phrases (e.g., computer programmer). This paper will attempt to show that many of these types of concerns can be addressed with syntactic methods (symbol pushing), and need not require explicit semantic interpretation. We have found that it is possible to identify many of these interesting co-occurrence relations by computing simple summary statistics over millions of words of text. This paper will summarize a number of experiments carried out by various subsets of the authors over the last few years. The term collocation will be used quite broadly to include constraints on SVO (subject verb object) triples, phrasal verbs, compound noun phrases, and psycholinguistic notions of word association (e.g., doctor/nurse).
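The abstract does not say which summary statistics are computed, so the following is only an illustrative sketch. It assumes pointwise mutual information over ordered word pairs within a small window as one plausible statistic of this kind. The function name `association_scores`, its parameters, and the toy corpus are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def association_scores(tokens, window=2, min_count=1):
    """Score ordered pairs (x, y) where y follows x within `window` tokens.

    Assumption: the score is pointwise mutual information,
    log2(P(x, y) / (P(x) * P(y))), with each probability estimated
    as a raw count divided by the corpus length. This is a simplified
    estimate chosen for illustration.
    """
    n = len(tokens)
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, x in enumerate(tokens):
        # Count every word that appears up to `window` positions after x.
        for y in tokens[i + 1 : i + 1 + window]:
            pair_counts[(x, y)] += 1

    scores = {}
    for (x, y), f_xy in pair_counts.items():
        if f_xy < min_count:
            continue  # rare pairs give unreliable estimates
        p_xy = f_xy / n
        p_x = word_counts[x] / n
        p_y = word_counts[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    # Toy corpus (hypothetical); real experiments would use millions of words.
    corpus = "drink wine drink wine prune tree prune tree".split()
    ranked = sorted(association_scores(corpus, window=1).items(),
                    key=lambda item: item[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 2))
```

On a corpus of realistic size, a pair such as (drink, wine) would be expected to score well above a pair such as (prune, wine), which is the kind of preference the abstract says parsers currently fail to exploit. Raising `min_count` filters out pairs seen too rarely for the estimate to be trustworthy.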