Sensitive words are compound words whose syntactic category differs from those of their components. Depending on how it is segmented, a sensitive word may play different roles, leading to significantly different syntactic structures. If syntactic analysis of a Chinese sentence fails, the sensitive words should be examined first, rather than each segmentation alternative in turn, in order to change the syntactic structure of the sentence. This yields higher efficiency than trying every alternative. Our examination of a machine-readable dictionary shows that there are a great number of such words, which indicates that sensitive words are a widespread phenomenon in Chinese.
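The recovery strategy described above can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `parse_with_sensitive_first`, `resegment_sensitive`, `parse`, `split`, and the `sensitive` lexicon are all hypothetical, and the parser is assumed to return `None` on failure. Abstract placeholder tokens stand in for Chinese words.

```python
from typing import Callable, Iterator, List, Optional, Set, TypeVar

T = TypeVar("T")  # type of a successful parse result (assumed, e.g. a tree)


def resegment_sensitive(
    tokens: List[str],
    sensitive: Set[str],
    split: Callable[[str], List[str]],
) -> Iterator[List[str]]:
    """Yield alternative segmentations, re-splitting one sensitive word at a time."""
    for i, tok in enumerate(tokens):
        if tok in sensitive:
            yield tokens[:i] + split(tok) + tokens[i + 1:]


def parse_with_sensitive_first(
    tokens: List[str],
    parse: Callable[[List[str]], Optional[T]],
    sensitive: Set[str],
    split: Callable[[str], List[str]],
) -> Optional[T]:
    """Parse the initial segmentation; on failure, try only the
    alternatives obtained by re-segmenting sensitive words."""
    result = parse(tokens)
    if result is not None:
        return result
    for alternative in resegment_sensitive(tokens, sensitive, split):
        result = parse(alternative)
        if result is not None:
            return result
    # Nothing worked; a caller could fall back to examining every
    # segmentation alternative in turn.
    return None
```

The point of the design is that the loop visits at most one alternative per sensitive word in the sentence, instead of enumerating all possible segmentations after a failed parse.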