In this paper we propose an explicit computer model for learning natural language syntax, based on Angluin's (1982) efficient induction algorithms and using a complete corpus of grammatical example sentences. We use these results to show how inductive inference methods may be applied to learn substantial, coherent subparts of at least one natural language – English – that are not susceptible to the kinds of learning envisioned in linguistic theory. As two concrete case studies, we show how to learn English auxiliary verb sequences (such as could be taking, will have been taking) and the sequences of articles and adjectives that appear before noun phrases (such as the very old big deer). Both systems can be acquired in a computationally feasible amount of time using either positive examples or, in an incremental mode, implicit negative examples (examples outside a finite corpus are considered to be negative examples). As far as we know, this is the first computer procedure that learns a full-scale range of noun subclasses and noun phrase structure. The generalizations and the time required for acquisition match our knowledge of child language acquisition for these two cases. More importantly, these results show that precisely where linguistic theories admit highly irregular subportions, we can apply efficient automata-theoretic learning algorithms. Since the algorithm works only for fragments of language syntax, we do not believe that it suffices for all of language acquisition. Rather, we would claim that language acquisition is nonuniform and susceptible to a variety of acquisition strategies; this algorithm may be one of these.
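To make the flavor of such automata induction concrete, the following is a minimal, hypothetical sketch (not the paper's actual procedure, which builds on Angluin's reversible-language algorithms): it constructs a prefix-tree acceptor from positive examples of auxiliary verb sequences and then generalizes by merging states with identical k-tails, a classic state-merging heuristic. All function names and the toy example corpus are illustrative assumptions.

```python
def build_pta(examples):
    """Build a prefix-tree acceptor from positive examples.
    States are ints; 0 is the start state."""
    trans = {}          # (state, symbol) -> state
    accept = set()
    next_state = 1
    for seq in examples:
        s = 0
        for sym in seq:
            if (s, sym) not in trans:
                trans[(s, sym)] = next_state
                next_state += 1
            s = trans[(s, sym)]
        accept.add(s)
    return trans, accept, next_state

def k_tails(state, trans, accept, k):
    """Accepting suffixes of length <= k reachable from `state`."""
    tails = set()
    if state in accept:
        tails.add(())
    if k > 0:
        for (s, sym), t in trans.items():
            if s == state:
                for tail in k_tails(t, trans, accept, k - 1):
                    tails.add((sym,) + tail)
    return frozenset(tails)

def merge_by_k_tails(examples, k=2):
    """Merge PTA states that share the same k-tail signature.
    Assumes the merged transitions stay deterministic, which holds
    for well-behaved (e.g. reversible-style) example sets."""
    trans, accept, n = build_pta(examples)
    sig = {s: k_tails(s, trans, accept, k) for s in range(n)}
    group = {}                          # signature -> merged-state id
    g = {s: group.setdefault(sig[s], len(group)) for s in range(n)}
    gtrans = {(g[s], sym): g[t] for (s, sym), t in trans.items()}
    gaccept = {g[s] for s in accept}
    return gtrans, gaccept, g[0]

def accepts(seq, gtrans, gaccept, start):
    """Run the merged automaton on a token sequence."""
    s = start
    for sym in seq:
        if (s, sym) not in gtrans:
            return False
        s = gtrans[(s, sym)]
    return s in gaccept

# Toy corpus of English auxiliary sequences from the abstract's examples.
examples = [["could", "be", "taking"],
            ["will", "have", "been", "taking"],
            ["will", "be", "taking"],
            ["could", "have", "been", "taking"]]
gtrans, gaccept, start = merge_by_k_tails(examples)
```

On this corpus the merge collapses the states reached after "could" and "will" (they share the same 2-tails), so the learned automaton encodes the modal-interchangeability generalization while still rejecting ill-formed sequences such as "will taking".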