ICWE'10 Proceedings of the 10th international conference on Current trends in web engineering
Many recent studies rely on the number of search results returned for a query, i.e., the hit count. However, the hit counts returned by search engines can vary unnaturally when observed on different days and may contain large errors that affect research depending on them. Such errors can lead to low precision in machine translation, incorrect synonym extraction, and other problems. It is therefore essential to evaluate and improve the reliability of hit counts. Several studies have reported this phenomenon, but none has made clear how far hit counts can be trusted. In this paper, we propose reliability metrics that quantitatively evaluate the reliability of hit counts in order to improve hit count selection. Evaluation results with Google show that our metrics successfully adopt reliable hit counts (99.8% precision) and skip unreliable hit counts (74.3% precision).