The edit-distance of two strings is the minimal cost of a sequence of symbol insertions, deletions, or substitutions transforming one string into the other. It is used in various contexts as a measure of the difference or similarity between two strings. The definition can be extended to measure the similarity between two sets of strings. In particular, when these sets are represented by automata, their edit-distance can be computed using the general algorithm for composition of weighted transducers combined with a single-source shortest-paths algorithm. More generally, in some applications such as speech recognition and computational biology, the strings may represent a range of alternative hypotheses with associated probabilities. We therefore introduce a definition of the edit-distance of two distributions of strings given by two weighted automata. We show that general weighted automata algorithms over the appropriate semirings can be used to compute the edit-distance of two weighted automata exactly. This exact algorithm can be used to improve the word accuracy of automatic speech recognition systems, and it can be extended to provide an edit-distance automaton useful for rescoring and other post-processing purposes in large-vocabulary speech recognition. In the course of presenting our algorithm, we also introduce a new and general synchronization algorithm for weighted transducers which, combined with ε-removal, can be used to normalize weighted transducers with bounded delays.
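
To make the unweighted case concrete, the following is a minimal sketch of the edit-distance between two sets of strings given as ε-free finite automata. It is not the weighted-automata algorithm of the paper: it assumes unit costs for insertion, deletion, and substitution, and it folds the composition with an edit transducer directly into a product-state search, then runs Dijkstra's single-source shortest-paths algorithm over that product. The dictionary representation of an automaton and the names `from_string` and `automata_edit_distance` are hypothetical, chosen only for this illustration.

```python
import heapq

def from_string(s):
    """Hypothetical helper: a linear automaton accepting exactly the string s."""
    return {
        "start": 0,
        "finals": {len(s)},
        "arcs": {i: [(c, i + 1)] for i, c in enumerate(s)},
    }

def automata_edit_distance(a, b):
    """Smallest unit-cost edit distance between a string of `a` and a string of `b`.

    Each automaton is a dict with keys "start" (state), "finals" (set of
    states), and "arcs" (state -> list of (symbol, next_state)); no epsilon
    arcs are assumed. Returns float("inf") if either language is empty.
    """
    inf = float("inf")
    start = (a["start"], b["start"])
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (p, q) = heapq.heappop(heap)
        if d > dist.get((p, q), inf):
            continue  # stale queue entry
        if p in a["finals"] and q in b["finals"]:
            return d  # costs are non-negative, so the first final pair popped is optimal
        moves = []
        for x, p2 in a["arcs"].get(p, []):
            moves.append((1, (p2, q)))  # delete x
            for y, q2 in b["arcs"].get(q, []):
                # match costs 0, substitution costs 1
                moves.append((0 if x == y else 1, (p2, q2)))
        for _y, q2 in b["arcs"].get(q, []):
            moves.append((1, (p, q2)))  # insert y
        for cost, nxt in moves:
            nd = d + cost
            if nd < dist.get(nxt, inf):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return inf

# An automaton accepting the set {"cat", "cart"}.
cat_or_cart = {
    "start": 0,
    "finals": {3, 5},
    "arcs": {0: [("c", 1)], 1: [("a", 2)], 2: [("t", 3), ("r", 4)], 4: [("t", 5)]},
}

print(automata_edit_distance(from_string("kitten"), from_string("sitting")))
print(automata_edit_distance(cat_or_cart, from_string("card")))
```

For two single-string automata this reduces to the ordinary string edit-distance. For sets, the search returns the distance of the closest pair of strings, one from each set. Handling probabilities attached to the strings, as the abstract describes for weighted automata, requires the semiring-based algorithms of the paper and is outside the scope of this sketch.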