With advances in web technologies, more and more information on the Web is contained in dynamically generated web pages. Among the several types of web "dynamism", the most important is the case in which web pages are generated as results of queries submitted via web search forms to databases available online. These pages constitute the portion of the Web known as the deep Web. Existing estimates of the deep Web are based predominantly on studies of English-language deep web sites; the key parameters of the non-English segments of the deep Web have not been investigated so far. Thus, the currently known characteristics of the deep Web may be biased, especially given the steady increase in non-English web content. In this paper, we survey the part of the deep Web consisting of dynamic pages in one particular national domain. The national deep Web is estimated using the proposed sampling techniques. We report our observations and findings based on experiments conducted in summer 2005.