Proposing of modular system for web information extraction

  • Authors:
  • Vojtěch Jirkovský; Ivan Jelínek

  • Affiliations:
  • -;-

  • Venue:
  • CompSysTech '09 Proceedings of the International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing
  • Year:
  • 2009

Abstract

The paper discusses dimensions and parameters of web information extraction, such as the level of automation, the type of source document with respect to its structure, and the dependence on the extraction domain. In the second half, we analyze the extraction process in more detail according to these dimensions, propose its phases, and discuss the modules for each phase and the interfaces between phases.
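
The idea of phases connected by a common interface can be illustrated with a minimal sketch. The abstract does not name the phases or specify the interfaces, so everything below is an assumption made for illustration: the `Document` container, the `Module` type, and the two example modules (`clean_markup`, `extract_price`) are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical data object passed between phases (not from the paper)."""
    raw: str                       # source document as fetched
    text: str = ""                 # cleaned text produced by a preprocessing phase
    records: Dict[str, str] = field(default_factory=dict)  # extracted values


# Assumed interface between phases: every module maps a Document to a Document.
Module = Callable[[Document], Document]


def clean_markup(doc: Document) -> Document:
    """Example preprocessing module: strip tags and normalize whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", doc.raw)
    doc.text = " ".join(without_tags.split())
    return doc


def extract_price(doc: Document) -> Document:
    """Example domain-dependent module: find a number followed by 'EUR'."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*EUR", doc.text)
    if match:
        doc.records["price"] = match.group(1)
    return doc


def run_pipeline(doc: Document, modules: List[Module]) -> Document:
    """Apply the modules in order; each one can be swapped independently."""
    for module in modules:
        doc = module(doc)
    return doc


result = run_pipeline(
    Document(raw="<p>Price: <b>12.50</b> EUR</p>"),
    [clean_markup, extract_price],
)
print(result.records)
```

Because every module shares the same signature in this sketch, a domain-specific extractor could be replaced without touching the preprocessing step, which is one plausible reading of the modularity the abstract describes.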