IEEE Transactions on Pattern Analysis and Machine Intelligence
The presence of long gaps dramatically increases the difficulty of detecting and characterizing complex events hidden in long sequences. To cope with this problem, a learning algorithm based on an abstraction mechanism is proposed; it can infer a general model of complex events from a set of learning sequences. Events are described by means of regular expressions, and the abstraction mechanism is based on the substitution property of regular languages. The induction algorithm proceeds bottom-up, progressively coarsening the sequence granularity so that correlations between subsequences separated by long gaps emerge naturally. Two abstraction operators are defined. The first detects regular expressions that contain no iterative constructs and abstracts them into non-terminal symbols. The second detects and abstracts iterated subsequences. By interleaving the two operators, regular expressions in general form can be inferred. Both operators are based on string alignment algorithms taken from bioinformatics. A restricted form of the algorithm was outlined in previous papers, where the emphasis was on applications. Here an extended version of the algorithm is described and analyzed in detail.
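
As a rough illustration of the second operator (abstracting iterated subsequences), the following Python sketch replaces each run of consecutive copies of a short unit with a single abstract token. It is not the algorithm described in the paper: the paper's operators rely on string alignment, whereas this sketch assumes exact matching only, and the function name `abstract_iterations`, the parameters `min_repeats` and `max_unit`, and the `(unit)+` token notation are all hypothetical choices made for the example.

```python
def abstract_iterations(seq, min_repeats=2, max_unit=4):
    """Coarsen a string by collapsing exact tandem repeats.

    Scans left to right. At each position it looks for a unit of length
    1..max_unit that is repeated at least min_repeats times in a row,
    keeps the unit covering the most characters (shortest unit on ties),
    and emits the token "(unit)+". Other characters are copied unchanged.
    Assumption: exact matching, not the alignment-based detection of the paper.
    """
    tokens = []
    i = 0
    n = len(seq)
    while i < n:
        best_cover = 0   # number of characters covered by the best repeat
        best_unit = None
        for u in range(1, max_unit + 1):
            unit = seq[i:i + u]
            if len(unit) < u:
                break  # not enough characters left for a unit this long
            reps = 1
            while seq[i + reps * u:i + (reps + 1) * u] == unit:
                reps += 1
            # Strict '>' keeps the shortest unit when coverage is tied.
            if reps >= min_repeats and reps * u > best_cover:
                best_cover = reps * u
                best_unit = unit
        if best_unit is None:
            tokens.append(seq[i])
            i += 1
        else:
            tokens.append("(" + best_unit + ")+")
            i += best_cover
    return tokens


if __name__ == "__main__":
    print(abstract_iterations("xabababyzz"))
```

Each `(unit)+` token plays the role of a non-terminal symbol: by the substitution property of regular languages, putting the regular expression it stands for back in place of the token still yields a regular language. Applying such a step repeatedly shortens the sequence, which is the sense in which the granularity is coarsened.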