Information Extraction from Web Pages

Authors:
Róbert Novotny;Peter Vojtas;Dušan Maruscak
Venue:
WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
Year:
2009

Citing 7

Ontology-based extraction and structuring of information from data-rich unstructured documents
Proceedings of the seventh international conference on Information and knowledge management

A hierarchical approach to wrapper induction
Proceedings of the third annual conference on Autonomous Agents

Wrapper induction: efficiency and expressiveness
Artificial Intelligence - Special issue on Intelligent internet systems

IEPAD: information extraction based on pattern discovery
Proceedings of the 10th international conference on World Wide Web

Visual Web Information Extraction with Lixto
Proceedings of the 27th International Conference on Very Large Data Bases

RoadRunner: Towards Automatic Data Extraction from Large Web Sites
Proceedings of the 27th International Conference on Very Large Data Bases

Mining data records in Web pages
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Cited 1

Automatic information extraction from the web: case study with recipes
Proceedings of the 50th Annual Southeast Regional Conference

Abstract

We present a chain of techniques for extracting object attribute data from web pages that contain either data about multiple objects or detailed data about a single object. We discover data regions containing multiple data records, which are then extracted with the help of an extraction ontology. Furthermore, we present an additional algorithm for detail-page extraction based on the comparison of two HTML subtrees.
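
The idea of comparing two HTML subtrees for detail-page extraction can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes two detail pages generated from the same template, walks their trees in parallel, and treats text that differs at the same position as candidate attribute data, while identical text is taken to be template content. The names `Node`, `TreeBuilder`, `parse`, and `differing_text` are hypothetical, and the parser assumes properly nested HTML without void elements such as `<br>`.

```python
from html.parser import HTMLParser


class Node:
    """Minimal element node (hypothetical helper, not from the paper)."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []  # child Node objects, in document order
        self.text = ""      # text found directly inside this element


class TreeBuilder(HTMLParser):
    """Builds a simple tree; assumes well-nested HTML with no void tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def differing_text(a, b, path="", out=None):
    """Walk two subtrees in parallel and collect (path, text_a, text_b)
    for every position where the directly contained text differs."""
    if out is None:
        out = []
    here = f"{path}/{a.tag}"
    if a.text != b.text:
        out.append((here, a.text, b.text))
    # Only descend into children that line up by position and tag name.
    for child_a, child_b in zip(a.children, b.children):
        if child_a.tag == child_b.tag:
            differing_text(child_a, child_b, here, out)
    return out


page_a = "<div><h1>Camera X100</h1><p>Price: <b>199</b></p></div>"
page_b = "<div><h1>Camera Z5</h1><p>Price: <b>249</b></p></div>"

for path, left, right in differing_text(parse(page_a), parse(page_b)):
    print(path, left, right)
```

In this toy input the label "Price:" is identical on both pages and is skipped, while the heading and the price value differ and are reported. A realistic implementation would also need to handle optional or repeated elements, which a strict positional comparison like this one does not.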