The nature of statistical learning theory
The nature of statistical learning theory
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Discriminative Reranking for Natural Language Parsing
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Reducing multiclass to binary: a unifying approach for margin classifiers
The Journal of Machine Learning Research
Target word detection and semantic role chunking using support vector machines
NAACL-Short '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers - Volume 2
Intricacies of Collins' Parsing Model
Computational Linguistics
Introduction to the CoNLL-2000 shared task: chunking
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Use of support vector learning for chunk identification
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Automatic tagging of Arabic text: from raw text to base phrase chunks
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Construct state modification in the Arabic Treebank
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
An efficient part-of-speech tagger for arabic
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Morphological features for parsing morphologically-rich languages: a case of Arabic
SPMRL '11 Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages
SAMAR: a system for subjectivity and sentiment analysis of Arabic social media
WASSA '12 Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis
SAMAR: Subjectivity and sentiment analysis for Arabic social media
Computer Speech and Language
Hi-index | 0.00 |
Base Phrase Chunking (BPC) or shallow syntactic parsing is proving to be a task of interest to many natural language processing applications. In this paper, A BPC system is introduced that improves over state of the art performance in BPC using a new part of speech tag (POS) set. The new POS tag set, ERTS, reflects some of the morphological features specific to Modern Standard Arabic. ERTS explicitly encodes definiteness, number and gender information increasing the number of tags from 25 in the standard LDC reduced tag set to 75 tags. For the BPC task, we introduce a more language specific set of definitions for the base phrase annotations. We employ a support vector machine approach for both the POS tagging and the BPC processes. The POS tagging performance using this enriched tag set, ERTS, is at 96.13% accuracy. In the BPC experiments, we vary the feature set along two factors: the POS tag set and a set of explicitly encoded morphological features. Using the ERTS POS tagset, BPC achieves the highest overall Fβ=1 of 96.33% on 10 different chunk types outperforming the use of the standard POS tag set even when explicit morphological features are present.