In this paper we describe an efficient and scalable implementation for grammar induction based on the EMILE approach [2,3,4,5,6]. The current EMILE 4.1 implementation [11] is one of the first efficient grammar induction algorithms that works on free text. Although EMILE 4.1 is far from perfect, it enables researchers to carry out empirical grammar induction research on various types of corpora. The EMILE approach is based on notions from categorial grammar (cf. [10]), which is known to generate the class of context-free languages. EMILE learns from positive examples only (cf. [1,7,9]). We describe the algorithms underlying the approach and some interesting practical results on small and large text collections. As shown in the articles mentioned above, in the limit EMILE learns the correct grammatical structure of a language from sentences of that language. The experiments we conducted show that, in practice, EMILE 4.1 is efficient and scalable. The current implementation learns a subclass of the shallow context-free languages, a subclass that seems sufficiently rich to be of practical interest. In particular, EMILE appears to be a valuable tool for the syntactic and semantic analysis of large text corpora.
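To make the underlying idea concrete, the sketch below illustrates the substitutability principle on which EMILE-style induction rests: expressions that occur in the same contexts are candidates for the same grammatical type. This is a minimal, hypothetical illustration in Python, not the EMILE 4.1 algorithm itself; the function names and the single-word expressions are simplifying assumptions.

```python
# Minimal sketch of substitutability-based type discovery, in the spirit of
# EMILE-style grammar induction. NOT the actual EMILE 4.1 algorithm: real
# EMILE works with multi-word expressions and clusters an expression/context
# matrix, whereas this toy restricts expressions to single words.
from collections import defaultdict

def contexts(sentence):
    """Yield (context, expression) pairs: the expression is one word and the
    context is the sentence with that word replaced by a slot '_'."""
    words = sentence.split()
    for i, word in enumerate(words):
        ctx = tuple(words[:i]) + ("_",) + tuple(words[i + 1:])
        yield ctx, word

def induce_types(corpus):
    """Group expressions that share a context: under the substitutability
    heuristic, such expressions are candidates for the same type."""
    by_context = defaultdict(set)
    for sentence in corpus:
        for ctx, expr in contexts(sentence):
            by_context[ctx].add(expr)
    # Keep only contexts observed with more than one expression.
    return {ctx: exprs for ctx, exprs in by_context.items() if len(exprs) > 1}

corpus = [
    "the cat sleeps",
    "the dog sleeps",
    "the cat eats",
]
types = induce_types(corpus)
# ('the', '_', 'sleeps') groups {'cat', 'dog'} as a candidate noun type;
# ('the', 'cat', '_') groups {'sleeps', 'eats'} as a candidate verb type.
```

On larger corpora the same principle needs statistical support (a context shared once is weak evidence), which is where the clustering machinery of the full EMILE implementation comes in.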