This article describes the current implementation of the Alignment-Based Learning (ABL) framework (van Zaanen, 2002). ABL is an unsupervised grammar induction system based on Harris's (1951) idea of substitutability. Instances of the framework can be applied to an untagged, unstructured corpus of natural language sentences, resulting in a labelled, bracketed version of that corpus.

First, the framework aligns the sentences in the corpus pairwise, dividing each pair into parts that are equal in both sentences and parts that are unequal. Since substituting one unequal part for the other results in another valid sentence, the unequal parts are considered possible (and possibly overlapping) constituents. Second, the best of the possible constituents found in the first phase are selected.
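
The alignment phase can be illustrated with a minimal sketch. It is not the ABL implementation: it assumes whitespace tokenisation and uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever alignment method an ABL instance actually uses. The function name `candidate_constituents` and the example sentences are invented for illustration.

```python
from difflib import SequenceMatcher


def candidate_constituents(sentence_a, sentence_b):
    """Align two sentences word by word and return their unequal parts.

    Each result is a pair (span_from_a, span_from_b) of word lists that
    occupy the same position between shared words, so each span is a
    hypothesised constituent. Illustrative sketch only, not ABL itself.
    """
    a = sentence_a.split()  # assumption: whitespace tokenisation
    b = sentence_b.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    opcodes = matcher.get_opcodes()

    # Without any shared words there is no context that licenses a
    # substitution, so no constituents are hypothesised.
    if not any(tag == "equal" for tag, *_ in opcodes):
        return []

    candidates = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag != "equal":
            # Unequal parts: substituting one for the other yields the
            # other sentence, which is the substitutability test.
            candidates.append((a[i1:i2], b[j1:j2]))
    return candidates


if __name__ == "__main__":
    print(candidate_constituents("she sees the man", "she sees a large dog"))
```

For the pair above, the shared prefix "she sees" is the equal part, and "the man" and "a large dog" are returned as one pair of hypothesised constituents. Running this over every sentence pair in a corpus would produce the pool of possibly overlapping candidates from which the second phase selects; that selection step is not sketched here because the text does not specify its criterion.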