The World Wide Web is an unregulated communication medium with very limited means of quality control. Quality assurance has therefore become a key issue for many information retrieval services on the Internet, such as web search engines. This paper introduces evaluation and assessment methods for the quality of web pages. The proposed mechanisms are based on a set of quality criteria extracted from a targeted user survey. A weighted algorithmic interpretation of the most significant user-quoted quality criteria is proposed. In addition, the paper uses machine learning methods to predict the quality of web pages before they are downloaded. The set of quality criteria allows us to implement a web search engine with quality ranking schemes, and leads to web crawlers that can crawl quality web pages directly. The proposed approaches produce promising results on a sizeable web repository.
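As an illustration of what a weighted interpretation of quality criteria might look like, the following is a minimal sketch and not the paper's actual algorithm. The criterion names (`spelling`, `freshness`, `link_validity`), their weights, and the functions `quality_score` and `rank_by_quality` are all hypothetical; the abstract does not list the survey-derived criteria or their weights.

```python
from typing import Dict, List

# Hypothetical criteria and weights, chosen only for illustration.
# The paper derives its own criteria and weights from a user survey.
WEIGHTS: Dict[str, float] = {
    "spelling": 0.5,
    "freshness": 0.3,
    "link_validity": 0.2,
}


def quality_score(criteria: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of per-criterion scores in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = 0.0
    for name, weight in weights.items():
        # A missing criterion counts as 0; values are clamped to [0, 1].
        value = min(1.0, max(0.0, criteria.get(name, 0.0)))
        score += weight * value
    return score / total_weight


def rank_by_quality(pages: Dict[str, Dict[str, float]]) -> List[str]:
    """Return page URLs ordered from highest to lowest quality score."""
    return sorted(pages, key=lambda url: quality_score(pages[url]), reverse=True)
```

A quality ranking scheme of the kind the abstract mentions could then order candidate pages with `rank_by_quality`, assuming each page has already been assigned per-criterion scores by some upstream analysis.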