A method for automating the extraction of specialized information from the web

  • Authors: Ling Lin; Antonio Liotta; Andrew Hippisley

  • Affiliations: Department of Electronic Systems Engineering, University of Essex, Colchester, UK; Department of Electronic Systems Engineering, University of Essex, Colchester, UK; Department of Computing, University of Surrey, UK

  • Venue: CIS'05 Proceedings of the 2005 International Conference on Computational Intelligence and Security, Volume Part I
  • Year: 2005

Abstract

The World Wide Web can be viewed as a gigantic distributed database comprising millions of interconnected hosts, some of which publish information via web servers or peer-to-peer systems. We present a novel method for extracting semantically rich information from the web in a fully automated fashion. We illustrate our approach with a proof-of-concept application that scans millions of web pages for clues about the trend of the Chinese stock market. We report the outcomes of a 210-day study, which indicates a strong correlation between the information retrieved by our prototype and actual market behavior.