Language Simplification through Error-Correcting and Grammatical Inference Techniques

  • Authors:
  • Juan-Carlos Amengual; Alberto Sanchis; Enrique Vidal; José-Miguel Benedí

  • Affiliations:
  • Juan-Carlos Amengual: Universidad Jaume I, Campus de Riu Sec, 12071 Castellón, Spain. jcamen@inf.uji.es
  • Alberto Sanchis: Instituto Tecnológico de Informática, Camino de Vera s/n, 46071 Valencia, Spain. asanchis@iti.upv.es
  • Enrique Vidal: Instituto Tecnológico de Informática, Camino de Vera s/n, 46071 Valencia, Spain. evidal@iti.upv.es
  • José-Miguel Benedí: Instituto Tecnológico de Informática, Camino de Vera s/n, 46071 Valencia, Spain. jbenedi@iti.upv.es

  • Venue:
  • Machine Learning
  • Year:
  • 2001

Abstract

In many language processing tasks, most sentences convey rather simple meanings. Moreover, these tasks have a limited semantic domain that can be properly covered with a simple lexicon and a restricted syntax. Nevertheless, casual users are by no means expected to comply with any kind of formal syntactic restrictions, owing to the inherently “spontaneous” nature of human language. In this work, the use of error-correcting-based learning techniques is proposed to cope with the complex syntactic variability generally exhibited by natural language. In our approach, a complex task is modeled in terms of a basic finite-state model, F, and a stochastic error model, E. F should account for the basic (syntactic) structures underlying the task, which convey the meaning. E should account for general vocabulary variations, word disappearance, superfluous words, and so on. Each “natural” user sentence is thus considered a corrupted version (according to E) of some “simple” sentence of L(F). Adequate bootstrapping procedures are presented that incrementally improve the structure of F while estimating the probabilities for the operations of E. These techniques have been applied to a practical task of moderately high syntactic variability, and results are presented that show the potential of the proposed approach.
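
To make the F/E decomposition concrete, the sketch below shows one standard way to score a “natural” sentence against a finite-state model with edit operations: insertions for words the model expects but the input lacks, deletions for superfluous input words, and substitutions for vocabulary variations. The automaton, vocabulary, and unit costs are illustrative assumptions only, not the paper's actual task models; the paper learns the structure of F and estimates the probabilities of E from data.

```python
# A minimal sketch of error-correcting parsing against a finite-state
# model F: the input sentence is treated as a corrupted version of some
# string in L(F), and we search for the cheapest sequence of edit
# operations (the role played by the error model E). The automaton,
# words, and unit costs below are illustrative assumptions only.
import heapq

def min_correction_cost(sentence, transitions, start, finals,
                        ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Cheapest edit cost turning `sentence` into a string accepted by
    the automaton, via Dijkstra search over (words consumed, state).

    transitions: dict mapping (state, word) -> next_state
    """
    n = len(sentence)
    heap = [(0.0, 0, start)]          # (accumulated cost, position, state)
    settled = {}
    while heap:
        cost, i, q = heapq.heappop(heap)
        if settled.get((i, q), float("inf")) <= cost:
            continue
        settled[(i, q)] = cost
        if i == n and q in finals:    # all input consumed in a final state
            return cost
        if i < n:                     # deletion: skip a superfluous input word
            heapq.heappush(heap, (cost + del_cost, i + 1, q))
        for (state, word), nxt in transitions.items():
            if state != q:
                continue
            # insertion: hypothesize a model word missing from the input
            heapq.heappush(heap, (cost + ins_cost, i, nxt))
            if i < n:                 # match (free) or substitution
                step = 0.0 if sentence[i] == word else sub_cost
                heapq.heappush(heap, (cost + step, i + 1, nxt))
    return float("inf")               # sentence cannot be mapped into L(F)

# Toy model: L(F) = { "turn the light on" }.
F = {(0, "turn"): 1, (1, "the"): 2, (2, "light"): 3, (3, "on"): 4}
print(min_correction_cost("please turn light on".split(), F, 0, {4}))
# -> 2.0 (delete "please", insert "the")
```

In the paper's stochastic setting, the unit costs would be replaced by negative log-probabilities of the edit operations of E, so the search recovers the most likely “simple” source sentence of L(F) rather than the cheapest literal edit.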