In this paper we draw on our experience over the last two and a half years of developing a large-scale corpus of Arabic text annotated for morphological information, part of speech, English gloss, and syntactic structure to address the following questions: (a) How did we 'leapfrog' over the stumbling blocks of both methodology and training in setting up the Penn Arabic Treebank (ATB) annotation? (b) How did we reconcile the Penn Treebank annotation principles and practices with the traditional and more recent grammatical concepts of Modern Standard Arabic (MSA)? (c) What are the current issues and nagging problems? (d) What has been achieved, and what are our future expectations?