Structure detection system from web documents through backpropagation network learning

  • Authors:
  • Bok Keun Sun; Je Ryu; Kwang Rok Han

  • Affiliations:
  • Department of Computer Engineering, Hoseo University, Asan City, ChungNam, Korea; Department of Computer Engineering, Hoseo University, Asan City, ChungNam, Korea; Department of Computer Engineering, Hoseo University, Asan City, ChungNam, Korea

  • Venue:
  • AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

This paper discusses a system that learns the structure of Web documents through a backpropagation network and infers the structure of new Web documents. The system first converts Web documents into input for the backpropagation network by assigning an ID to each XPath. The network then repeats learning until the error rate falls below a threshold specified in the system. After learning, a new Web document is passed through the network, and the system infers the structure of the document and extracts the information that matches that structure. The main advantages of the system are that the learning process requires no human intervention and that the network is designed to reach the best learning result by varying its internal factors and parameters. When the implemented system was evaluated, the average recall was 99.5% and the precision was 96.6%, suggesting satisfactory performance.
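
The pipeline described in the abstract (XPath to ID, then backpropagation until the error drops below a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the input encoding, the network size, or the training parameters, so the one-hot encoding, the single hidden layer, the learning rate, the error threshold, and all names (`assign_xpath_ids`, `one_hot`, `TinyNet`) are assumptions made for illustration only.

```python
import math
import random


def assign_xpath_ids(xpaths):
    """Map each distinct XPath string to an integer ID (order of first appearance)."""
    ids = {}
    for path in xpaths:
        ids.setdefault(path, len(ids))
    return ids


def one_hot(index, size):
    """Assumed input encoding: one input unit per XPath ID."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """Hypothetical one-hidden-layer network with a single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train(self, samples, target_error=0.01, lr=0.5, max_epochs=20000):
        """Repeat backpropagation until the mean squared error is below target_error."""
        mse = float("inf")
        for epoch in range(1, max_epochs + 1):
            total = 0.0
            for x, y in samples:
                hidden, out = self.forward(x)
                err = y - out
                total += err * err
                delta_out = err * out * (1.0 - out)
                for j, h in enumerate(hidden):
                    # Hidden delta uses the output weight before it is updated.
                    delta_h = delta_out * self.w2[j] * h * (1.0 - h)
                    self.w2[j] += lr * delta_out * h
                    self.b1[j] += lr * delta_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] += lr * delta_h * xi
                self.b2 += lr * delta_out
            mse = total / len(samples)
            if mse < target_error:
                break
        return epoch, mse


# Toy data (invented): label 1.0 marks an XPath whose node holds target information.
labelled = [("/html/body/div/h1", 1.0), ("/html/body/div/p", 0.0),
            ("/html/body/table/tr/td", 1.0), ("/html/body/div/a", 0.0)]
ids = assign_xpath_ids(path for path, _ in labelled)
samples = [(one_hot(ids[path], len(ids)), label) for path, label in labelled]
net = TinyNet(n_in=len(ids), n_hidden=3)
epochs, mse = net.train(samples)
```

After training, calling `net.forward(one_hot(ids[path], len(ids)))[1]` on the XPath of a node from a new document gives a score between 0 and 1; thresholding that score at 0.5 is one possible way to decide whether the node should be extracted.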