Stochastic approaches to natural language processing have often been preferred to rule-based approaches because of their robustness and their automatic training capabilities. This was the case for part-of-speech tagging until Brill showed that a rule-based tagger can achieve state-of-the-art results by inferring its rules from a training corpus. However, current implementations of the rule-based tagger run more slowly than previous approaches. In this paper, we present a finite-state tagger, inspired by the rule-based tagger, that operates in optimal time, in the sense that the time to assign tags to a sentence corresponds to the time required to follow a single path in a deterministic finite-state machine. This result is achieved by encoding the application of the tagger's rules as a nondeterministic finite-state transducer and then turning it into a deterministic transducer. The resulting deterministic transducer yields a part-of-speech tagger whose speed is dominated by the access time of mass storage devices. We then generalize the techniques to the class of transformation-based systems.
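The idea of following a single deterministic path can be illustrated with a small sketch. The Python below hand-codes a deterministic transducer for one contextual rule, assumed here purely for illustration: change the tag "vbd" to "vbn" when the next tag is "by". The rule, the tag names, the state names, and the functions `step`, `transduce`, and `apply_rule_naively` are hypothetical. The transducer is written by hand and is not produced by the compilation and determinization procedure the abstract describes. What it shows is how a look-ahead rule can be applied in one left-to-right pass by delaying output until the context is known.

```python
from typing import Dict, List, Tuple

State = str
Tag = str

# Hypothetical rule (an assumption for illustration):
#   change "vbd" to "vbn" if the next tag is "by".
# States: "start" (nothing pending), "hold" (a "vbd" was read but not yet emitted).


def step(state: State, tag: Tag) -> Tuple[State, List[Tag]]:
    """One deterministic transition: return the next state and the tags emitted."""
    if state == "start":
        if tag == "vbd":
            return "hold", []           # delay output until the next tag is seen
        return "start", [tag]
    # state == "hold": a "vbd" is pending
    if tag == "by":
        return "start", ["vbn", "by"]   # context matched: emit the rewritten tag
    if tag == "vbd":
        return "hold", ["vbd"]          # flush the pending tag, hold the new one
    return "start", ["vbd", tag]        # context failed: emit the tag unchanged


# Output emitted when the input ends in a given state.
FINAL_OUTPUT: Dict[State, List[Tag]] = {"start": [], "hold": ["vbd"]}


def transduce(tags: List[Tag]) -> List[Tag]:
    """Apply the rule in a single pass, one transition per input tag."""
    state: State = "start"
    out: List[Tag] = []
    for tag in tags:
        state, emitted = step(state, tag)
        out.extend(emitted)
    out.extend(FINAL_OUTPUT[state])
    return out


def apply_rule_naively(tags: List[Tag]) -> List[Tag]:
    """Reference implementation: inspect the following tag at every position."""
    return [
        "vbn" if t == "vbd" and i + 1 < len(tags) and tags[i + 1] == "by" else t
        for i, t in enumerate(tags)
    ]


if __name__ == "__main__":
    sample = ["np", "vbd", "by", "np"]
    print(transduce(sample))
    print(transduce(sample) == apply_rule_naively(sample))
```

Each input tag triggers exactly one transition, so the running time is linear in the sentence length however the rule's context is checked. With many rules, the corresponding step in this sketch would be to combine them into one such machine rather than to run one pass per rule.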