Organizing structured web sources by query schemas: a clustering approach
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Clustering e-commerce search engines based on their search interface pages using WISE-cluster
Data & Knowledge Engineering - Special issue: WIDM 2004
Combining classifiers to identify online databases
Proceedings of the 16th international conference on World Wide Web
Automatic wrapper induction from hidden-web sources with domain knowledge
Proceedings of the 10th ACM workshop on Web information and data management
Hi-index | 0.00 |
The data sources of Deep Web provide tremendous structured data with high quality. However, classifying these interfaces with domain independent is required since the domains of the huge scale of deep web are hard to predefine. In this paper, we propose a novel approach with three-stage to solve this problem. First, we extract both texts and structure of a query interface by applying FIE algorithm we proposed. Then we construct frequent itemsets by using frequent pattern mining algorithm. Finally, we apply AP clustering algorithm to cluster the frequent itemsets according to similarity measure FGSTD presented in this paper. The experiment demonstrates our approach clusters interfaces well with domain independent.