Problems in the area of text and document processing can often be described as text rewriting tasks: given an input text, produce a new text by applying a fixed set of rewriting rules. In its simplest form, a rewriting rule is a pair of strings: a source string (the “original”) and its substitute. A rewriting dictionary is a finite list of such pairs, and dictionary-based text rewriting means replacing occurrences of originals in an input text with their substitutes. We present an efficient method for constructing, given a rewriting dictionary D, a subsequential transducer that accepts any text t as input and outputs the rewriting result t' obtained under the so-called “leftmost-longest match” replacement strategy with skips. The time needed to compute the transducer is linear in the size of the input dictionary. Given the transducer, any text t of length |t| is rewritten deterministically in time O(|t|+|t'|). The resulting rewriting mechanism is therefore very efficient. As a second advantage, the transducer can be composed directly with other transducers, using standard tools, to solve more complex rewriting tasks efficiently in a single processing step.
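
To make the replacement strategy concrete, the following Python sketch rewrites a text with a dictionary under one common reading of “leftmost-longest match with skips”: scanning left to right, the longest original starting at the current position is replaced by its substitute, and if no original starts there, the character is copied unchanged (a skip). This is not the subsequential transducer construction described above. It is a naive trie-based baseline whose worst-case running time grows with |t| times the length of the longest original, rather than O(|t|+|t'|). The names `build_trie` and `rewrite` and the sample dictionary are hypothetical and serve only as illustration.

```python
def build_trie(dictionary):
    """Build a trie over the originals; the key None stores the substitute."""
    root = {}
    for original, substitute in dictionary.items():
        if not original:
            raise ValueError("originals must be non-empty")
        node = root
        for ch in original:
            node = node.setdefault(ch, {})
        node[None] = substitute  # marks the end of an original
    return root


def rewrite(text, trie):
    """Naive leftmost-longest rewriting with skips (illustrative baseline)."""
    out = []
    i, n = 0, len(text)
    while i < n:
        node, j = trie, i
        best_end, best_sub = -1, None
        # Walk the trie as far as the text allows, remembering the
        # longest original that ends along the way.
        while j < n and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                best_end, best_sub = j, node[None]
        if best_sub is None:
            out.append(text[i])  # skip: no original starts at position i
            i += 1
        else:
            out.append(best_sub)  # replace the longest match
            i = best_end
    return "".join(out)


# Hypothetical sample dictionary: "abc" wins over "ab" at the same position.
D = {"ab": "X", "abc": "Y", "bc": "Z"}
print(rewrite("abcd ab bcx", build_trie(D)))  # prints "Yd X Zx"
```

In this baseline, a failed long match forces the scan to restart from the next character, so the same input characters may be read several times. Per the abstract, the transducer processes the text deterministically in time O(|t|+|t'|), which is its advantage over a scheme like this one.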