Handbook of formal languages, vol. 3
Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II
PCFG models of linguistic tree representations
Computational Linguistics
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Using an annotated corpus as a stochastic grammar
EACL '93 Proceedings of the sixth conference on European chapter of the Association for Computational Linguistics
What is the minimal set of fragments that achieves maximal parse accuracy?
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
GenERRate: generating errors for use in grammatical error detection
EdAppsNLP '09 Proceedings of the Fourth Workshop on Innovative Use of NLP for Building Educational Applications
Inducing compact but accurate tree-substitution grammars
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Bayesian learning of a tree substitution grammar
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Simple, accurate parsing with an all-fragments grammar
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Syntax-based language models for statistical machine translation
Character-based kernels for novelistic plot structure
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Stylometric analysis of scientific articles
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Judging grammaticality with count-induced tree substitution grammars
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Large-scale syntactic language modeling with treelets
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Native language detection with tree substitution grammars
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
In this paper, we show that local features computed from the derivations of tree substitution grammars (TSGs), such as the identity of particular fragments and counts of large and small fragments, are useful in binary grammatical classification tasks. Such features outperform n-gram features and various model scores by a wide margin. Although they fall short of the performance of the hand-crafted feature set of Charniak and Johnson (2005) developed for parse tree reranking, they do so with an order of magnitude fewer features. Furthermore, since the TSGs employed are learned in a Bayesian setting, the use of their derivations can be viewed as the automatic discovery of tree patterns useful for classification. On the BLLIP dataset, we achieve an accuracy of 89.9% in discriminating between grammatical text and samples from an n-gram language model.
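The kind of derivation features named in the abstract can be illustrated with a minimal sketch. It assumes a derivation is already available as a list of bracketed fragment strings. The function names, the feature-name prefixes, the use of bracket depth as a proxy for fragment size, and the `large_depth` threshold are all hypothetical choices made for illustration; the abstract does not specify how "large" and "small" are defined.

```python
from collections import Counter
from typing import Dict, List


def fragment_depth(fragment: str) -> int:
    """Return the maximum bracket nesting depth of a fragment string."""
    depth = 0
    max_depth = 0
    for ch in fragment:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth


def derivation_features(fragments: List[str], large_depth: int = 2) -> Dict[str, int]:
    """Map a derivation (a list of fragments) to sparse count features.

    Assumption: a fragment is "large" if its bracket depth is at least
    `large_depth`; this threshold is illustrative, not from the paper.
    """
    feats: Counter = Counter()
    for frag in fragments:
        # Identity feature: which fragment was used, and how often.
        feats["frag=" + frag] += 1
        # Size features: counts of large and small fragments.
        if fragment_depth(frag) >= large_depth:
            feats["num_large"] += 1
        else:
            feats["num_small"] += 1
    return dict(feats)


# Hypothetical derivation of a short sentence.
derivation = ["(S NP (VP VBD NP))", "(NP DT NN)", "(NP DT NN)", "(VBD saw)"]
features = derivation_features(derivation)
```

A sparse dictionary of this form could then be passed to a linear classifier; the intuition is that derivations of ungrammatical text may tend to rely on more small fragments.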