A method for automating the extraction of specialized information from the web

Authors:
Ling Lin;Antonio Liotta;Andrew Hippisley
Affiliations:
Department of Electronic Systems Engineering, University of Essex, Colchester, UK;Department of Electronic Systems Engineering, University of Essex, Colchester, UK;Department of Computing, University of Surrey, UK
Venue:
CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
Year:
2005

Abstract

The World Wide Web can be viewed as a gigantic distributed database comprising millions of interconnected hosts, some of which publish information via web servers or peer-to-peer systems. We present a novel method for extracting semantically rich information from the web in a fully automated fashion. We illustrate our approach with a proof-of-concept application that scans millions of web pages for clues about the trend of the Chinese stock market. We present the outcomes of a 210-day study, which indicate a strong correlation between the information retrieved by our prototype and actual market behavior.
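
The abstract does not say how the correlation between the extracted information and market behavior was measured. As a minimal sketch, assuming a Pearson correlation between two equally long daily series, the comparison could look like the following. The function name `pearson` and the idea of pairing one extracted signal value with one market value per day are illustrative assumptions, not the authors' stated method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series.

    Assumption for illustration: xs holds one extracted signal value per day
    and ys holds the corresponding market value (e.g. an index change).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value near 1 would mean the two series rise and fall together, a value near -1 that they move in opposite directions, and a value near 0 that there is no linear relationship.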