This paper proposes a novel approach that uses a Hidden Markov Model (HMM) to find product prices on internet sites. The proposed HMM-based Information Extraction System (IESHMM) exploits the HMM's ability to process temporal information. IESHMM first processes the web pages returned by search engines and then extracts specific fields such as prices, descriptions, locations, product images, and other information of interest. IESHMM is evaluated on real-world problems and compared with a conventional method. The results show that IESHMM outperforms the conventional method by 22.9% in average recall and 37.2% in average precision.
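To make the idea of HMM-based field extraction concrete, the following is a minimal sketch of Viterbi decoding over a tokenized snippet of page text. It is not the IESHMM model itself: the abstract does not give the state set, the observation features, or the trained parameters, so the two states (`other`, `price`), the coarse token classes, and every probability below are hypothetical values chosen by hand for illustration.

```python
import math

# Hypothetical two-state model; the abstract does not specify the real states.
STATES = ["other", "price"]

# Hand-picked illustrative probabilities, NOT parameters from the paper.
START = {"other": 0.8, "price": 0.2}
TRANS = {
    "other": {"other": 0.7, "price": 0.3},
    "price": {"other": 0.6, "price": 0.4},
}
EMIT = {
    "other": {"word": 0.80, "currency": 0.05, "number": 0.15},
    "price": {"word": 0.05, "currency": 0.45, "number": 0.50},
}


def token_class(token):
    """Map a raw token to a coarse observation symbol (assumed feature set)."""
    if token in {"$", "USD", "EUR"}:
        return "currency"
    if token.replace(".", "", 1).isdigit():
        return "number"
    return "word"


def viterbi(tokens):
    """Return the most likely state sequence for the tokens (log-space Viterbi)."""
    obs = [token_class(t) for t in tokens]
    if not obs:
        return []

    # Scores for the first observation.
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}]
    backpointers = []

    # Forward pass: keep the best predecessor for each state at each step.
    for o in obs[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            pointers[s] = best
        scores.append(current)
        backpointers.append(pointers)

    # Backward pass: follow the stored pointers from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    tokens = ["Laptop", "for", "$", "499.99", "today"]
    for token, state in zip(tokens, viterbi(tokens)):
        print(f"{token}\t{state}")
```

With these illustrative parameters, the tokens `$` and `499.99` are labeled `price` and the surrounding words `other`. A real system would estimate the start, transition, and emission probabilities from labeled pages and would use one state per target field (price, description, location, and so on).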