Web Structure, Dynamics and Page Quality
SPIRE 2002 Proceedings of the 9th International Symposium on String Processing and Information Retrieval
Cooperation Schemes between a Web Server and a Web Search Engine
LA-WEB '03 Proceedings of the First Conference on Latin American Web Congress
Scheduling Algorithms for Web Crawling
LA-WEBMEDIA '04 Proceedings of the WebMedia & LA-Web 2004 Joint Conference 10th Brazilian Symposium on Multimedia and the Web 2nd Latin American Web Congress
On the image content of a web segment: Chile as a case study
Journal of Web Engineering
Effect of word density on measuring words association
COMPUTE '08 Proceedings of the 1st Bangalore Annual Compute Conference
Efficiently detecting webpage updates using samples
ICWE'07 Proceedings of the 7th international conference on Web engineering
Web site traffic ranking estimation via SVM
ICIC'10 Proceedings of the Advanced intelligent computing theories and applications, and 6th international conference on Intelligent computing
Fixing the threshold for effective detection of near duplicate web documents in web crawling
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
A constrained crawling approach and its application to a specialised search engine
International Journal of Information and Communication Technology
E-FFC: an enhanced form-focused crawler for domain-specific deep web databases
Journal of Intelligent Information Systems
GAT: Platform for automatic context-aware mobile services for m-tourism
Expert Systems with Applications: An International Journal
The key factors in the success of the World Wide Web are its large size and the lack of centralized control over its contents. Both factors are also the most important sources of problems for locating information. The Web is a context in which traditional Information Retrieval methods are challenged: given the volume of the Web and its speed of change, the coverage of modern search engines is relatively small. Moreover, the distribution of quality is very skewed, and interesting pages are scarce in comparison with the rest of the content.