Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I
Extracting multiword expressions with a semantic tagger
MWE '03 Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment - Volume 18
This study is part of a series of research projects on e-learning and English for Specific Purposes (ESP) and is based on an original corpus of engineering journals. In this paper the results of a corpus study are presented, and a sample of the ESP e-learning materials being developed for graduate students in engineering is shown. Abstracts were chosen for the corpus this time because students are likely to read many for their research and will eventually have to produce their own. We prepared a 40,000-word corpus consisting of 263 abstracts from mechanical and electrical engineering journals. The corpus was analyzed using Wmatrix, which assigns part-of-speech tags and semantic tags, and the results were compared with those of the BNC written corpus sampler. Some special features found in the analysis are the frequencies of semantic tags and part-of-speech tags, and differences in the use of verbal forms and multi-word expressions. As an application of these features, we are developing web-based materials that include the original abstracts with target items hyperlinked to various pages containing exercises, concordances, grammar explanations, a bilingual dictionary, etc.
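
The abstract does not say which statistic was used to compare tag frequencies in the engineering corpus with those in the BNC sampler. One common choice for this kind of comparison is the log-likelihood ratio, so the following is only a minimal sketch under that assumption. The function name, its parameters, and the example counts are hypothetical and are not taken from the study.

```python
import math


def log_likelihood(freq_study, freq_ref, size_study, size_ref):
    """Log-likelihood score for one item (word or tag) in two corpora.

    freq_study / freq_ref: occurrences of the item in each corpus.
    size_study / size_ref: total token counts of each corpus.
    All names here are illustrative, not from the paper.
    """
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected counts if the item were equally frequent in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    score = 0.0
    # A zero observed count contributes nothing to the sum.
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * score


if __name__ == "__main__":
    # Hypothetical counts: a tag seen 20 times in a 1,000-token study
    # corpus and 10 times in a 1,000-token reference corpus.
    print(round(log_likelihood(20, 10, 1000, 1000), 3))
```

A higher score means the item's frequency differs more between the two corpora than equal relative frequency would predict. The score alone does not show direction, so the relative frequencies should be compared as well to see whether the item is overused or underused in the study corpus.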