A Web data extraction approach to harvesting data from online sources

Authors:
Richi Nayak;Magnus Haugaasen
Affiliations:
School of Information Systems, Queensland University of Technology, Brisbane, Australia {r.nayak@qut.edu.au, magnus.haugaasen@gmail.com}
Venue:
Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Year:
2006

Abstract

As the Web becomes a primary medium for publishing data, businesses have the opportunity to gather data from various independent web sources and consolidate them into specialized services. However, web pages share no unified structure, so extracting data from these sources can be a complex task. We present a solution that locates and extracts data from a large group of online bookmaker pages to provide a real-time service that delivers prices on sporting events.