Intra-chunk dependency annotation: expanding Hindi inter-chunk annotated treebank

Authors:
Prudhvi Kosaraju; Samar Husain; Bharat Ram Ambati; Dipti Misra Sharma; Rajeev Sangal
Affiliations:
International Institute of Information Technology, Hyderabad, India; University of Potsdam, Potsdam, Germany; International Institute of Information Technology, Hyderabad, India; International Institute of Information Technology, Hyderabad, India; International Institute of Information Technology, Hyderabad, India
Venue:
LAW VI '12 Proceedings of the Sixth Linguistic Annotation Workshop
Year:
2012

Abstract

We present two approaches, one rule-based and one statistical, for automatically annotating intra-chunk dependencies in Hindi. The intra-chunk dependencies are added to Hindi dependency trees that are already annotated with inter-chunk dependencies, so the intra-chunk annotator yields a fully parsed dependency tree for each Hindi sentence. We first describe the guidelines for marking intra-chunk dependency relations. Although the guidelines are written for Hindi, they can easily be extended to other Indian languages. These guidelines are used to frame the rules in the rule-based approach. For the statistical approach, we use MaltParser, a data-driven parser. A part of the ICON 2010 tools contest data for Hindi is used for training and testing MaltParser. The same set is used for testing the rule-based approach.
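The rule-based idea described in the abstract can be illustrated with a minimal sketch: every non-head token in a chunk is attached to the chunk head, and the arc label is looked up from the part-of-speech tags of the head and the dependent. The `Token` class, the `RULES` table, the tag pairs, and the label names below are illustrative assumptions made for this example; they are not the rule set or label inventory from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    idx: int               # position of the token in the sentence
    form: str              # word form (romanized here for readability)
    pos: str               # part-of-speech tag
    is_head: bool = False  # True for the chunk head, already known from inter-chunk annotation


# Hypothetical rule table: (head POS, dependent POS) -> intra-chunk label.
# These entries are assumptions for illustration only.
RULES: Dict[Tuple[str, str], str] = {
    ("NN", "JJ"): "nmod__adj",
    ("NN", "PSP"): "lwg__psp",
    ("VM", "VAUX"): "lwg__vaux",
}


def expand_chunk(tokens: List[Token]) -> List[Tuple[int, int, str]]:
    """Return (dependent idx, head idx, label) arcs for one chunk."""
    head = next(t for t in tokens if t.is_head)
    arcs = []
    for t in tokens:
        if t is head:
            continue
        # Fall back to a placeholder label when no rule matches.
        label = RULES.get((head.pos, t.pos), "UNK")
        arcs.append((t.idx, head.idx, label))
    return arcs


# A three-token noun chunk: adjective, head noun, postposition.
chunk = [
    Token(1, "nIlI", "JJ"),
    Token(2, "kitAba", "NN", is_head=True),
    Token(3, "ko", "PSP"),
]
arcs = expand_chunk(chunk)
```

Because the chunk head already carries its inter-chunk relation, adding these arcs to the existing tree is what would turn a chunk-level tree into a token-level one in this simplified picture. A statistical parser such as MaltParser would instead learn such attachments from annotated data.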