The objective of this study is to develop a measurement of search result relevance for Chinese queries by comparing four Chinese search engines (A, B, C, D). Relevance was measured with the first-N method and statistical tests. Through blind evaluation of the first 10 search results, four indexes were computed: average precision within the first n results (P@n), hit rate within the first n results (H@n), mean dead-link rate within the first n results (MD@n), and mean reciprocal rank of the first relevant document (MRR1@n). The results showed that engine C scored best on MD@n, while engine A scored best on the other three indexes. However, statistical analysis indicated no significant differences among the four engines on P@n, H@n, and MRR1@n; only MD@n differed significantly.
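As a rough illustration of how these four indexes could be computed, the sketch below scores one query's judged results. The function names, data layout, and sample judgments are assumptions for illustration, not the authors' implementation; the per-query dead-link and reciprocal-rank values would be averaged over the full query set to obtain the reported MD@n and MRR1@n.

# Minimal sketch (assumed helpers, not the paper's code) of the four indexes
# for a single query, given boolean relevance and dead-link judgments.

def p_at_n(relevant, n):
    """Precision within the first n results: fraction judged relevant."""
    return sum(relevant[:n]) / n

def h_at_n(relevant, n):
    """Hit rate within the first n results: 1 if at least one result is relevant."""
    return 1.0 if any(relevant[:n]) else 0.0

def d_at_n(dead, n):
    """Dead-link rate within the first n results: fraction of dead links.
    Averaging this over all queries gives MD@n."""
    return sum(dead[:n]) / n

def rr1_at_n(relevant, n):
    """Reciprocal rank of the first relevant document within the first n results.
    Averaging this over all queries gives MRR1@n."""
    for rank, rel in enumerate(relevant[:n], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

# Hypothetical blind judgments of one query's first 10 results.
relevant = [False, True, True, False, True, False, False, True, False, False]
dead = [False, False, False, False, False, True, False, False, False, False]

print(p_at_n(relevant, 10))    # 0.4
print(h_at_n(relevant, 10))    # 1.0
print(d_at_n(dead, 10))        # 0.1
print(rr1_at_n(relevant, 10))  # 0.5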