We study a range of syntactic processing tasks using a general statistical framework that consists of a global linear model trained by the generalized perceptron, together with a generic beam-search decoder. We apply the framework to word segmentation, joint segmentation and POS-tagging, dependency parsing, and phrase-structure parsing. Both components of the framework are conceptually and computationally very simple. The beam-search decoder only requires the syntactic processing task to be broken into a sequence of decisions, such that, at each stage in the process, the decoder is able to consider the top-n candidates and generate all possibilities for the next stage. Once the decoder has been defined, it is applied to the training data, using trivial updates according to the generalized perceptron to induce a model. This simple framework performs surprisingly well, giving accuracy results competitive with the state of the art on all the tasks we consider. The computational simplicity of the decoder and training algorithm leads to significantly higher test speeds and lower training times than their main alternatives, including log-linear and large-margin training algorithms and dynamic programming for decoding. Moreover, the framework offers the freedom to define arbitrary features, which can make alternative training and decoding algorithms prohibitively slow. We discuss how the general framework is applied to each of the problems studied in this article, making comparisons with alternative learning and decoding algorithms. We also show how the comparability of candidates considered by the beam is an important factor in the performance. We argue that the conceptual and computational simplicity of the framework, together with its language-independent nature, makes it a competitive choice for a range of syntactic processing tasks and one that should be considered for comparison by developers of alternative approaches.
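The two components described in the abstract can be illustrated with a minimal sketch, here for word segmentation over a toy alphabet. Everything in it is an assumption made for illustration and is not taken from the article: the feature map (word identity and word length), the two actions (start a new word, or extend the last word), the function names, and the toy training data. The update shown is the plain perceptron update applied after a full decode, which may differ from the update strategy the article actually uses.

```python
from collections import defaultdict

def features(words):
    """Hypothetical global feature map: counts of word identities and lengths."""
    feats = defaultdict(int)
    for w in words:
        feats["word=" + w] += 1
        feats["len=%d" % len(w)] += 1
    return feats

def score(weights, words):
    """Global linear model: dot product of the weights with the feature counts."""
    return sum(weights[f] * v for f, v in features(words).items())

def expand(words, ch):
    """Generate all possibilities for the next stage from one candidate."""
    out = [words + [ch]]                            # start a new word
    if words:
        out.append(words[:-1] + [words[-1] + ch])   # extend the last word
    return out

def decode(weights, sentence, beam_size=4):
    """Beam search: keep the top-n partial segmentations after each character."""
    beam = [[]]
    for ch in sentence:
        candidates = [c for words in beam for c in expand(words, ch)]
        candidates.sort(key=lambda c: score(weights, c), reverse=True)
        beam = candidates[:beam_size]
    return beam[0]

def train(data, epochs=5, beam_size=4):
    """Perceptron training: on a mistake, add gold features, subtract predicted."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, gold in data:
            pred = decode(weights, sentence, beam_size)
            if pred != gold:
                for f, v in features(gold).items():
                    weights[f] += v
                for f, v in features(pred).items():
                    weights[f] -= v
    return weights

# Toy data (invented for this sketch): raw string and its gold segmentation.
data = [("abcab", ["ab", "c", "ab"]), ("cab", ["c", "ab"])]
weights = train(data)
print(decode(weights, "abcab"))
print(decode(weights, "cab"))
```

In this sketch the partial last word is scored as if it were complete, so candidates in the beam are compared on slightly different footing at each step. That is a simplification, and it touches on the point the abstract raises about the comparability of candidates in the beam.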