Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
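As an illustration of what an exact conditional test looks like for lexical data, the sketch below assumes the test in question is Fisher's exact test on a 2x2 contingency table of bigram counts; the abstract itself does not name the test, so this is an assumption. The function name `fisher_exact_right_tail` and the counts in the usage lines are hypothetical and chosen only for illustration. The point is that, with the table margins held fixed, the p-value is computed directly from the hypergeometric distribution rather than from a large-sample approximation, so it remains valid when the counts are small.

```python
from math import comb


def fisher_exact_right_tail(n11, n12, n21, n22):
    """One-sided (right-tail) exact p-value for a 2x2 contingency table.

    Assumed layout for a candidate bigram (w1, w2):
      n11: w1 followed by w2
      n12: w1 followed by some other word
      n21: some other word followed by w2
      n22: neither w1 first nor w2 second
    """
    row1 = n11 + n12              # total occurrences of w1 in first position
    col1 = n11 + n21              # total occurrences of w2 in second position
    total = n11 + n12 + n21 + n22

    # With the margins fixed, n11 is hypergeometric under independence.
    # Sum the probabilities of every table at least as extreme as the
    # observed one, i.e. with n11 or more joint occurrences.
    numerator = 0
    for k in range(n11, min(row1, col1) + 1):
        numerator += comb(row1, k) * comb(total - row1, col1 - k)
    return numerator / comb(total, col1)


# Hypothetical counts for a rare bigram in a small corpus (illustrative only).
p_value = fisher_exact_right_tail(3, 1, 1, 3)
print(f"right-tail p-value: {p_value:.6f}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for large corpora a library routine working in log space would likely be preferable.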