A statistical parser for Czech

Authors:
Michael Collins;Lance Ramshaw;Jan Hajič;Christoph Tillmann
Affiliations:
AT&T Labs-Research, Shannon Laboratory, Florham Park, NJ;BBN Technologies, Cambridge, MA;Charles University, Prague, Czech Republic;Lehrstuhl für Informatik VI, RWTH Aachen, Aachen, Germany
Venue:
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Year:
1999

Citing 9
Cited 66

Head-driven statistical models for natural language parsing

Building a large annotated corpus of English: the Penn Treebank

Computational Linguistics - Special issue on using large corpora: II
Three generative, lexicalised models for statistical parsing

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Tagging inflective languages: prediction of morphological categories for a rich, structured tagset

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Statistical decision-tree models for parsing

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
A new statistical parser based on bigram lexical dependencies

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Three new probabilistic models for dependency parsing: an exploration

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Decision tree parsing using a hidden derivation model

HLT '94 Proceedings of the workshop on Human Language Technology
Statistical parsing with a context-free grammar and word statistics

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence

The Current Status of the Prague Dependency Treebank

TSD '01 Proceedings of the 4th International Conference on Text, Speech and Dialogue
A new approach to conceptual document indexing: building a hierarchical system of concepts based on document clusters

ISICT '03 Proceedings of the 1st international symposium on Information and communication technologies
Constraint based integration of deep and shallow parsing techniques

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Converting dependency structures to phrase structures

HLT '01 Proceedings of the first international conference on Human language technology research
The simple core and the complex periphery of natural language a formal and a computational view

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Topic-focus and salience

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Probabilistic parsing for German using sister-head dependencies

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Parsing with generative models of predicate-argument structure

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Deep syntactic processing by combining shallow methods

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Bootstrapping parsers via syntactic projection across parallel texts

Natural Language Engineering
Dependency Parsing with an Extended Finite-State Approach

Computational Linguistics
Head-Driven Statistical Models for Natural Language Parsing

Computational Linguistics
Use of dependency tree structures for the microcontext extraction

RANLPIR '00 Proceedings of the ACL-2000 workshop on Recent advances in natural language processing and information retrieval: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 11
Experiments in parallel-text based grammar induction

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Online large-margin training of dependency parsers

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Pseudo-projective dependency parsing

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Lexicalization in crosslinguistic probabilistic parsing: the case of French

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Graph transformations in data-driven dependency parsing

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Deterministic dependency parsing of English text

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Annotation strategies for probabilistic parsing in German

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Non-projective dependency parsing using spanning tree algorithms

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Morphology and reranking for the statistical parsing of Spanish

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Mildly non-projective dependency structures

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Building a dynamic lexicon from a digital library

Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
Statistical machine translation

ACM Computing Surveys (CSUR)
A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers

GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
Dependency parsing of Turkish

Computational Linguistics
Slavonic information extraction and partial parsing

ACL '07 Proceedings of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies
Analysis of link grammar on biomedical dependency corpus targeted at protein-protein interactions

JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
CoNLL-X shared task on multilingual dependency parsing

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Language independent probabilistic context-free parsing bolstered by machine learning

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Three-dimensional parametrization for parsing morphologically rich languages

IWPT '07 Proceedings of the 10th International Conference on Parsing Technologies
Multiple-step treebank conversion: from dependency to Penn format

LAW '07 Proceedings of the Linguistic Annotation Workshop
Corrective modeling for non-projective dependency parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Improving parsing accuracy by combining diverse dependency parsers

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Exploiting heterogeneous treebanks for parsing

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
A multi-representational and multi-layered treebank for Hindi/Urdu

ACL-IJCNLP '09 Proceedings of the Third Linguistic Annotation Workshop
An empirical study of semi-supervised structured conditional models for dependency parsing

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
An alternative to head-driven approaches for parsing a (relatively) free word-order language

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Accurate unlexicalized parsing for modern Hebrew

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Dependency and phrasal parsers of the Czech language: a comparison

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Feature engineering in maximum spanning tree dependency parser

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
A simulated shallow dependency parser based on weighted hierarchical structure learning

AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Improving Arabic dependency parsing with lexical and inflectional morphological features

SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Application of different techniques to dependency parsing of Basque

SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Better Arabic parsing: baselines, evaluations, and analysis

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Phrase structure parsing with dependency structure

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Automatic treebank conversion via informed decoding

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Syntactic analysis using finite patterns: a new parsing system for Czech

LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics
Improving Arabic dependency parsing with form-based and functional morphological features

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Automatic Treebank Conversion via Informed Decoding - A Case Study on Chinese Treebanks

ACM Transactions on Asian Language Information Processing (TALIP)
Producing Power-Law Distributions and Damping Word Frequencies with Two-Stage Language Models

The Journal of Machine Learning Research
The incremental use of morphological information and lexicalization in data-driven dependency parsing

ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
Multi-source transfer of delexicalized dependency parsers

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Automatically inducing a part-of-speech tagger by projecting from multiple source languages across aligned corpora

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Combining Czech dependency parsers

TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
Problems of inducing large coverage constraint-based dependency grammar for Czech

CSLP'04 Proceedings of the First international conference on Constraint Solving and Language Processing
Adaptation of data and models for probabilistic parsing of Portuguese

PROPOR'06 Proceedings of the 7th international conference on Computational Processing of the Portuguese Language
Morphological features for parsing morphologically-rich languages: a case of Arabic

SPMRL '11 Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages
On the adequacy of three POS taggers and a dependency parser

CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Exploiting multiple treebanks for parsing with quasi-synchronous grammars

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Using parallel features in parsing of machine-translated sentences for correction of grammatical errors

SSST-6 '12 Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation
Parsing morphologically rich languages: Introduction to the special issue

Computational Linguistics
Morphological and syntactic case in statistical dependency parsing

Computational Linguistics
Dependency parsing of Modern Standard Arabic with lexical and inflectional features

Computational Linguistics

Abstract

This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model of (Collins 97). Our final results (80% dependency accuracy) represent good progress towards the 91% accuracy of the parser on English (Wall Street Journal) text.