Language resources are essential to natural language processing research and applications. This paper introduces our ongoing work on building a situation-based language knowledge base for Chinese from two kinds of existing resources: three Chinese semantic lexicons and a large-scale Chinese treebank. We developed a supporting platform that exploits the information in current Chinese semantic lexicons to gradually summarize complete situation descriptions, organize them into a situation network, and build a corresponding dictionary of descriptive definitions for different concepts. We also explored an efficient algorithm for linking syntax to semantics, which introduces suitable semantic interpretations into the current Chinese treebank and gradually builds a situation-based, semantically annotated corpus. Together, this work lays a foundation for the computational infrastructure of Chinese natural language processing.