Multiple sections extraction using visual cue

  • Authors:
  • Derren Wong;Jer Lang Hong

  • Affiliations:
  • School of Computing and IT, Taylor's University, Malaysia;School of Computing and IT, Taylor's University, Malaysia

  • Venue:
  • ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V
  • Year:
  • 2012

Abstract

Current wrappers are unable to extract data records from multiple sections of search engine results pages, because such sections usually have complicated layouts and structures. Extracting data from search engine results pages is important for meta search engine applications and for evaluating comparative shopping lists. In this paper, we present a novel data extraction technique that uses visual cues to check for regularity of structure in multiple-section data records. Our findings show that, although multiple-section data records show no regularity in their underlying structure, they do show regularity in their visual cues. Our technique can serve as a model for future multiple-section data extraction, and it will be useful for meta search engine applications, which need an accurate tool to locate their sources of information.