Implementing Alignment-Based Learning

Authors:
Menno van Zaanen
Affiliations:
-
Venue:
ICGI '02 Proceedings of the 6th International Colloquium on Grammatical Inference: Algorithms and Applications
Year:
2002


Abstract

This article describes the current implementation of the Alignment-Based Learning (ABL) framework (van Zaanen, 2002). ABL is an unsupervised grammar induction system based on Harris's (1951) idea of substitutability. Instances of the framework can be applied to an untagged, unstructured corpus of natural language sentences, resulting in a labelled, bracketed version of that corpus.

The framework works in two phases. First, it aligns the sentences in the corpus in pairs, which divides each pair into parts that are equal in both sentences and parts that are unequal. Since substituting one unequal part for the other results in another valid sentence, the unequal parts are considered possible (and possibly overlapping) constituents. Second, the best of the possible constituents found in the first phase are selected.
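The alignment phase can be illustrated with a minimal sketch. It is not the ABL implementation: it assumes whitespace tokenization and uses Python's standard difflib.SequenceMatcher as a stand-in for whatever alignment method a given ABL instance uses. The function name hypothesize_constituents is hypothetical. Pairs of sentences that share no words are skipped here, on the assumption that an alignment without any equal part gives no evidence of substitutability. The selection phase is not shown.

```python
from difflib import SequenceMatcher


def hypothesize_constituents(sent_a, sent_b):
    """Align two sentences and return their unequal parts as token-list pairs.

    Illustrative only: difflib stands in for the alignment step, and the
    sentences are tokenized by splitting on whitespace.
    """
    a, b = sent_a.split(), sent_b.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)

    # Without at least one equal part there is no shared context, so no
    # hypotheses are proposed (an assumption of this sketch).
    if not any(size for _, _, size in matcher.get_matching_blocks()):
        return []

    hypotheses = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # The unequal parts occur in the same context, so each is
            # hypothesized to be a constituent of the same type.
            hypotheses.append((a[i1:i2], b[j1:j2]))
    return hypotheses


pairs = hypothesize_constituents("she likes the book", "she likes green apples")
print(pairs)  # [(['the', 'book'], ['green', 'apples'])]
```

In this example the equal part is "she likes", and the unequal parts "the book" and "green apples" are interchangeable in that context, so both become candidate constituents.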