A set of information retrieval (IR) experiments was carried out at the Division of Information Studies, Nanyang Technological University, Singapore, to study the effect of Chinese word segmentation on IR. Four automatic character-based segmentation approaches and one manual word segmentation approach were first applied to obtain the word segments for indexing and to evaluate the segmentation accuracy of the automatic approaches. The IR experiments examine both the influence of different document segmentation approaches on IR effectiveness and the methods used for query segmentation. Traditional recall and precision measures were used to gauge IR effectiveness. A number of queries were then selected for detailed analysis to explore further the influence of word segmentation on IR.

The findings reveal that the segmentation approach has an effect on IR effectiveness. Better IR results are obtained by using the same method for query and document processing, as this increases the probability of a query-document match. The recognition of a higher number of 2-character words generally contributes to the improvement of IR effectiveness. However, manual segmentation does not always work better than character-based segmentation, as a result of the existence of longer words with more than two characters. No evidence is found that ambiguous words resulting from the segmentation process significantly affect IR.
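The point about processing queries and documents with the same method can be illustrated with a minimal sketch. The abstract does not name the four character-based approaches, so the overlapping character n-gram segmentation below is an assumption chosen for illustration, and the function names (`char_ngrams`, `retrieve`, `precision_recall`), the two toy documents, and the relevance judgment are all hypothetical rather than taken from the study.

```python
def char_ngrams(text, n):
    """Split text into overlapping character n-grams, ignoring whitespace.

    Assumed stand-in for a character-based segmentation approach.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < n:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def retrieve(query_terms, index):
    """Return ids of documents that share at least one term with the query."""
    q = set(query_terms)
    return {doc_id for doc_id, terms in index.items() if q & set(terms)}


def precision_recall(retrieved, relevant):
    """Set-based precision and recall; 0.0 when a denominator is empty."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection and relevance judgment.
docs = {"d1": "信息检索系统", "d2": "中文分词方法"}
relevant = {"d1"}
query = "信息检索"

# Documents are indexed with 2-character segments.
index = {doc_id: char_ngrams(text, 2) for doc_id, text in docs.items()}

# Same segmentation for query and documents: the segments line up.
matched = retrieve(char_ngrams(query, 2), index)

# Different segmentation for the query (single characters): nothing lines up.
mismatched = retrieve(char_ngrams(query, 1), index)

print(matched, precision_recall(matched, relevant))
print(mismatched, precision_recall(mismatched, relevant))
```

In this toy case the 2-character query segments retrieve the relevant document, while the single-character query segments match nothing in a 2-character index. It only shows the matching mechanism and says nothing about the size of the effect reported in the study.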