Self-Supervised Chinese Word Segmentation
IDA '01 Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis
We propose a self-supervised word segmentation technique for Chinese information retrieval. The method combines the advantages of traditional dictionary-based approaches with those of character-based approaches, while overcoming many of their shortcomings. Experiments on TREC data show performance comparable to both the dictionary-based and the character-based approaches. However, our method is language-independent and unsupervised, which provides a promising avenue for constructing accurate multilingual information retrieval systems that are flexible and adaptive.
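
For readers unfamiliar with the two baseline families named above, the following is a minimal Python sketch of what each typically looks like. It is not the self-supervised method proposed here. It assumes greedy forward maximum matching as a representative dictionary-based segmenter and overlapping character bigrams as a representative character-based indexing unit. The function names, the `max_len` parameter, and the toy lexicon are hypothetical and chosen only for illustration.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching, a common dictionary-based baseline."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when no lexicon entry matches at this position.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def char_bigrams(text):
    """Overlapping character bigrams, a common character-based indexing unit."""
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {"信息", "检索", "信息检索", "系统"}

if __name__ == "__main__":
    print(max_match("信息检索系统", TOY_LEXICON))
    print(char_bigrams("信息检索"))
```

The sketch makes the trade-off concrete: the dictionary-based segmenter depends entirely on lexicon coverage and degrades to single characters on unknown words, whereas the character-based tokenizer needs no lexicon but produces units that cross word boundaries.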