Data-Oriented Parsing
Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Bootstrapping parsers via syntactic projection across parallel texts
Natural Language Engineering
A hybrid approach to align sentences and words in English-Hindi parallel corpora
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Rich bitext projection features for parse reranking
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Example-based parsing has already been proposed in the literature. In particular, techniques are being developed for language pairs in which the source and target languages differ, e.g. the Direct Projection Algorithm (DPA) (Hwa et al., 2005). Such techniques make it possible to build a parsed corpus for a target language that has few linguistic tools by drawing on a resource-rich source language. DPA rests on the Direct Correspondence assumption: a relation between two words of the source-language sentence can be projected directly onto the corresponding words of the parallel target-language sentence. However, we find that this assumption does not always hold, and when it fails the projected parse structure of the target-language sentence is wrong. As a solution, we propose an algorithm called pseudo DPA (pDPA) that works even when the Direct Correspondence assumption is not guaranteed. The proposed algorithm works recursively, considering the embedded phrase structures from the outermost level to the innermost. This paper presents the pDPA algorithm and illustrates it on the English-Hindi language pair, using Link Grammar based parsing as the underlying parsing scheme.
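
To make the Direct Correspondence assumption concrete, the following is a minimal sketch of direct projection only. It is not the pDPA algorithm, whose details the abstract does not give. The function name `project_links`, the link representation, the word alignment, and the example sentence pair are all hypothetical choices made for illustration; the labels `S`, `O` and `D` loosely follow Link Grammar conventions for subject, object and determiner links.

```python
from typing import Dict, List, Tuple

# A link is (word index, word index, label) within one sentence.
# This representation is an assumption made for this sketch.
Link = Tuple[int, int, str]


def project_links(
    source_links: List[Link], alignment: Dict[int, int]
) -> Tuple[List[Link], List[Link]]:
    """Project source-language links onto the target sentence.

    `alignment` maps a source word index to a target word index
    (one-to-one here, which is a simplification). A link is projected
    only when both of its words are aligned; otherwise it is returned
    in the second list as a case where Direct Correspondence fails.
    """
    projected: List[Link] = []
    unprojected: List[Link] = []
    for word_a, word_b, label in source_links:
        if word_a in alignment and word_b in alignment:
            projected.append((alignment[word_a], alignment[word_b], label))
        else:
            unprojected.append((word_a, word_b, label))
    return projected, unprojected


# Hypothetical example: English "Ram eats the mangoes"
#                       Hindi   "Ram aam khata hai"
# English indices: 0 Ram, 1 eats, 2 the, 3 mangoes
# Hindi indices:   0 Ram, 1 aam, 2 khata, 3 hai
source_links = [(0, 1, "S"), (1, 3, "O"), (2, 3, "D")]
alignment = {0: 0, 1: 2, 3: 1}  # "the" has no Hindi counterpart

projected, unprojected = project_links(source_links, alignment)
print(projected)    # [(0, 2, 'S'), (2, 1, 'O')]
print(unprojected)  # [(2, 3, 'D')]
```

In this toy case the subject and object links carry over, while the determiner link has no counterpart because the article is unaligned. Mismatches of this kind are one simple way in which the assumption can fail to hold.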