Data mining for grammatical inference with bioinformatics criteria

Authors:
Vivian F. López;Ramiro Aguilar;Luis Alonso;María N. Moreno
Affiliations:
Departament Informática y Automática, University of Salamanca, Plaza de la Merced S/N, 37008 Salamanca, Spain (all four authors)
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Citing 1
Cited 1

A bibliographical study of grammatical inference

Pattern Recognition

Exemplar driven development of software product lines

Expert Systems with Applications: An International Journal

Quantified Score

Hi-index	12.05

Abstract

This work describes a novel data mining process that combines hybrid association-analysis techniques with classical sequencing algorithms from genomics to generate the grammatical structures of a specific language. These structures are then converted to context-free grammars. The method initially applies to context-free languages, with the possibility of being extended to other languages: structured programming languages, the language of the book of life expressed in the genome and proteome, and even natural languages. We used a compiler generator system to develop a practical application within the area of grammarware, where the concepts of language analysis are applied to other disciplines, such as bioinformatics. The tool allows the complexity of the obtained grammar to be measured automatically from textual data.
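
As a rough illustration of how frequent patterns in a token sequence can be turned into context-free rules, the sketch below uses a Re-Pair-style pair replacement: the most frequent adjacent pair of symbols is repeatedly replaced by a new nonterminal. This is a stand-in technique chosen for brevity, not the process described in the paper, and the names `induce_grammar`, `expand`, `grammar_size` and the parameter `min_support` are hypothetical. The size measure at the end is only one simple proxy for grammar complexity.

```python
from collections import Counter


def induce_grammar(tokens, min_support=2):
    """Re-Pair-style sketch: replace the most frequent adjacent pair with a
    new nonterminal until no pair occurs at least `min_support` times.

    Assumption: no input token is already named like "N1", "N2", ... or "S".
    """
    seq = list(tokens)
    rules = {}
    next_id = 1
    while True:
        # Count adjacent pairs (overlapping occurrences are counted too,
        # so the count is an upper bound on the replacements made).
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < min_support:
            break
        name = f"N{next_id}"
        next_id += 1
        rules[name] = [a, b]
        # Rewrite the sequence left to right, without overlapping matches.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    rules["S"] = seq  # start rule holds whatever sequence remains
    return rules


def expand(rules, symbol="S"):
    """Expand a symbol back into the terminal sequence it derives."""
    if symbol not in rules:
        return [symbol]  # terminal
    result = []
    for s in rules[symbol]:
        result.extend(expand(rules, s))
    return result


def grammar_size(rules):
    """Simple complexity proxy: (number of rules, total right-hand-side symbols)."""
    return len(rules), sum(len(rhs) for rhs in rules.values())


if __name__ == "__main__":
    tokens = "a b c a b c a b".split()
    grammar = induce_grammar(tokens)
    for head, body in grammar.items():
        print(head, "->", " ".join(body))
    print("size:", grammar_size(grammar))
```

Because each rule derives exactly one string, expanding the start symbol `S` reproduces the input, which gives a quick sanity check on the induced rules.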