Most existing collocation extraction systems rely on globally significant statistical behavior and have no mechanism for handling different types of collocations. In this work, collocations are categorized into four types according to their compositionality, substitutability, modifiability, and internal association. Based on an analysis of each type, a multi-stage extraction system is designed in which each stage uses a different combination of discriminative features to identify a different type of collocation. Perceptron training is used to optimize how discriminative features from different sources are combined. Experimental results show that the system performs considerably better than most previously reported work.
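
To illustrate the idea of learning how to combine discriminative features with a perceptron, the following is a minimal sketch, not the authors' implementation. The three feature names (association strength, resistance to substitution, resistance to modification), the toy values, and the function names `train_perceptron` and `is_collocation` are all assumptions made for illustration; the abstract does not specify the actual features, data, or training setup.

```python
from typing import List, Sequence, Tuple

# Each sample is (feature_vector, label), with label +1 (collocation) or -1 (not).
Sample = Tuple[Sequence[float], int]


def train_perceptron(samples: List[Sample], epochs: int = 100,
                     lr: float = 1.0) -> Tuple[List[float], float]:
    """Learn weights and a bias that combine features into one decision score."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if score > 0 else -1
            if predicted != y:
                # Standard perceptron update: move the boundary toward the mistake.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return weights, bias


def is_collocation(x: Sequence[float], weights: Sequence[float],
                   bias: float) -> bool:
    """Classify a candidate bigram from its combined feature score."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


# Hypothetical feature order (all scaled to [0, 1], invented for this sketch):
# [association strength, resistance to substitution, resistance to modification]
TOY_DATA: List[Sample] = [
    ((0.9, 0.8, 0.7), 1),
    ((0.8, 0.9, 0.6), 1),
    ((0.7, 0.6, 0.9), 1),
    ((0.2, 0.1, 0.3), -1),
    ((0.1, 0.3, 0.2), -1),
    ((0.3, 0.2, 0.1), -1),
]

if __name__ == "__main__":
    w, b = train_perceptron(TOY_DATA)
    print("weights:", w, "bias:", b)
    print("candidate accepted:", is_collocation((0.85, 0.7, 0.8), w, b))
```

In a multi-stage design such as the one described, a separate weight vector could in principle be trained for each stage, with each stage restricted to the features relevant to one collocation type; that arrangement is likewise an assumption here rather than a detail confirmed by the abstract.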