Much recent research has been devoted to meta-search and multilingual search, with the aim of improving performance and widening the scope of search. Because most existing web search algorithms were originally developed for English web documents, it is reasonable to question their efficiency and performance when they are applied to documents in other languages. In this work, we study Chinese web search and Chinese web documents. We identify and discuss potential issues and problems in applying well-known English-language-based algorithms to Chinese web documents. From our qualitative and exploratory quantitative analysis, we conclude that these algorithms and techniques cannot be used directly to develop an efficient Chinese search engine.