IEEE Transactions on Knowledge and Data Engineering
This paper presents a system that learns the structure of Web documents with a backpropagation network and infers the structure of new Web documents. The system first converts each Web document into network input by assigning an ID to each XPath. The network then repeats training until the error rate falls below a level specified in the system. After training, a new Web document is passed through the network, which infers the document's structure and extracts the information that fits that structure. The main advantages of the system are that the learning process requires no human intervention and that the network is designed to reach the best learning result by varying its internal factors and parameters. In evaluation, the implemented system achieved an average recall of 99.5% and a precision of 96.6%, suggesting satisfactory performance.
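The pipeline described above (XPath to ID, then backpropagation until the error falls below a set level) can be illustrated with a minimal sketch. It is not the paper's implementation: the one-hot encoding of the IDs, the single hidden layer, the learning rate, the error threshold, and the example XPaths and labels are all assumptions chosen for illustration.

```python
import math
import random


def assign_xpath_ids(xpaths):
    """Map each distinct XPath string to an integer ID in first-seen order."""
    ids = {}
    for xp in xpaths:
        if xp not in ids:
            ids[xp] = len(ids)
    return ids


def one_hot(index, size):
    """Encode an ID as a one-hot vector (an assumed input encoding)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_h, w_o, x):
    """Return hidden activations and the single output for input x."""
    xb = x + [1.0]  # append bias input
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
    y = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
    return h, y


def train(samples, n_in, n_hidden=4, rate=0.5, target_error=0.01,
          max_epochs=20000, seed=0):
    """Online backpropagation; stops once mean squared error < target_error."""
    rng = random.Random(seed)
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    error = float("inf")
    for _ in range(max_epochs):
        error = 0.0
        for x, t in samples:
            h, y = forward(w_h, w_o, x)
            error += 0.5 * (t - y) ** 2
            # Output delta, then hidden deltas using the pre-update weights.
            d_o = (t - y) * y * (1 - y)
            d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            w_o = [w + rate * d_o * v for w, v in zip(w_o, h + [1.0])]
            for j in range(n_hidden):
                w_h[j] = [w + rate * d_h[j] * v
                          for w, v in zip(w_h[j], x + [1.0])]
        error /= len(samples)
        if error < target_error:
            break
    return w_h, w_o, error


# Hypothetical training data: XPaths labelled 1 if they hold a data field.
labelled = [
    ("/html/body/table/tr/td[1]", 1.0),
    ("/html/body/table/tr/td[2]", 1.0),
    ("/html/body/div/p", 0.0),
    ("/html/body/div/a", 0.0),
]
ids = assign_xpath_ids(xp for xp, _ in labelled)
samples = [(one_hot(ids[xp], len(ids)), t) for xp, t in labelled]
w_h, w_o, error = train(samples, n_in=len(ids))
predictions = [forward(w_h, w_o, x)[1] for x, _ in samples]
```

Here the stopping rule mirrors the abstract's "until the error rate falls below the specified level"; the `max_epochs` cap is an added safeguard so the loop always terminates.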