IEEE Transactions on Pattern Analysis and Machine Intelligence
In this paper, we describe a generalization of k-gram models to stochastic tree languages. These models are based on the k-testable class, a subclass of the languages recognizable by ascending tree automata. One advantage of this approach is that the probabilistic model can be updated incrementally. Another is that backing-off schemes can be defined. To illustrate their applicability, the models have been used to compress tree data files at a better rate than string-based methods.
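As a rough illustration of the idea, and not the authors' implementation, the following minimal sketch estimates a k-testable tree model by counting. It makes several assumptions: trees are nested tuples `(label, child, ...)`, the names `root`, `subtrees`, and `KTestableModel` are hypothetical, probabilities are plain relative frequencies, and no backing-off or smoothing is applied, so unseen events receive probability zero. Each node contributes the depth-k truncation of its subtree, conditioned on the depth-(k-1) truncation, which plays the role of the automaton state.

```python
import math
from collections import Counter


def root(tree, k):
    """Truncate `tree` to its top k levels (hypothetical helper)."""
    label, children = tree[0], tree[1:]
    if k <= 1:
        return (label,)
    return (label,) + tuple(root(child, k - 1) for child in children)


def subtrees(tree):
    """Yield the subtree rooted at every node of `tree`."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)


class KTestableModel:
    """Relative-frequency estimates over depth-k truncations (a sketch)."""

    def __init__(self, k=2):
        if k < 2:
            raise ValueError("this sketch assumes k >= 2")
        self.k = k
        self.fork_counts = Counter()   # depth-k truncation at each node
        self.state_counts = Counter()  # depth-(k-1) truncation at each node
        self.root_counts = Counter()   # depth-(k-1) truncation of whole trees
        self.n_trees = 0

    def update(self, tree):
        """Incremental update: adding one tree only increments counters."""
        self.n_trees += 1
        self.root_counts[root(tree, self.k - 1)] += 1
        for sub in subtrees(tree):
            self.fork_counts[root(sub, self.k)] += 1
            self.state_counts[root(sub, self.k - 1)] += 1

    def log_prob(self, tree):
        """Natural-log probability of `tree`; -inf if any event is unseen."""
        r = self.root_counts[root(tree, self.k - 1)]
        if r == 0:
            return float("-inf")
        lp = math.log(r / self.n_trees)
        for sub in subtrees(tree):
            f = self.fork_counts[root(sub, self.k)]
            if f == 0:
                return float("-inf")
            lp += math.log(f / self.state_counts[root(sub, self.k - 1)])
        return lp


model = KTestableModel(k=2)
model.update(("S", ("a",), ("b",)))
model.update(("S", ("a",)))
print(model.log_prob(("S", ("a",), ("b",))))
```

Because the model is stored as counters, new trees can be folded in without retraining, which is one way to read the incremental-update property. The quantity `-model.log_prob(t) / math.log(2)` is the ideal code length of `t` in bits under the model, which is the figure a compressor driven by such a model would aim to approach.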