WMS-extracting multiple sections data records from search engine results pages
Proceedings of the 2010 ACM Symposium on Applied Computing
Current wrappers cannot extract data records that span multiple sections of a search engine results page, because such sections usually have complicated layout and structure. Extracting data from search engine results pages is important for meta search engines and for evaluating comparative shopping lists. In this paper, we present a novel data extraction technique that uses visual cues to check for regularity of structure in multiple-section data records. Our findings show that although multiple-section data records appear to lack regular structure, visual cues reveal regularity in their structure. The technique can serve as a model for future multiple-section data extraction and will be useful for meta search engine applications, which need an accurate tool to locate their sources of information.