Information Extraction System Based on Hidden Markov Model

  • Authors:
  • Dong-Chul Park; Vu Thi Huong; Dong-Min Woo; Duong Ngoc Hieu; Sai Thi Ninh

  • Affiliations:
  • Dept. of Information Engineering, Myong Ji University, Korea; Dept. of Information Engineering, Myong Ji University, Korea; Dept. of Information Engineering, Myong Ji University, Korea; Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam; Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam

  • Venue:
  • ISNN '09 Proceedings of the 6th International Symposium on Neural Networks on Advances in Neural Networks
  • Year:
  • 2009

Abstract

This paper proposes a novel approach that uses a Hidden Markov Model (HMM) to find the prices of products on internet sites. The proposed Information Extraction System based on HMM (IESHMM) relies on the HMM for its capability to process temporal information. IESHMM first processes the web pages returned by search engines and then extracts specific fields such as prices, descriptions, locations, and images of products, along with other information of interest. IESHMM is evaluated on real-world problems and compared with a conventional method. The results show that IESHMM outperforms the conventional method by 22.9% in average recall and by 37.2% in average precision.
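
To make the idea of HMM-based field extraction concrete, the following is a minimal sketch of Viterbi decoding over a tokenized snippet of page text. It is not the authors' system: the abstract does not describe the state set, the observation features, or the training procedure, so the two states (`other`, `price`), the coarse token classes, and every probability below are hypothetical values chosen by hand for illustration. In practice such parameters would be estimated from labeled pages.

```python
import math

# Hypothetical two-state model; the paper's actual states are not given in the abstract.
STATES = ["other", "price"]

# Hand-set illustrative probabilities (assumptions, not values from the paper).
START = {"other": 0.8, "price": 0.2}
TRANS = {
    "other": {"other": 0.7, "price": 0.3},
    "price": {"other": 0.6, "price": 0.4},
}
# Emission probabilities over coarse token classes (also an assumption).
EMIT = {
    "other": {"word": 0.80, "currency": 0.10, "number": 0.10},
    "price": {"word": 0.05, "currency": 0.45, "number": 0.50},
}


def token_class(token):
    """Map a raw token to one of three coarse observation symbols."""
    if token in {"$", "USD", "EUR", "€", "£"}:
        return "currency"
    if token.replace(",", "").replace(".", "", 1).isdigit():
        return "number"
    return "word"


def viterbi(tokens):
    """Return the most likely state sequence for the tokens (log-space Viterbi)."""
    obs = [token_class(t) for t in tokens]
    if not obs:
        return []
    # Scores for the first observation.
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}]
    backpointers = []
    for o in obs[1:]:
        prev = scores[-1]
        column, pointer = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            column[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            pointer[s] = best
        scores.append(column)
        backpointers.append(pointer)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


def extract_prices(tokens):
    """Keep only the tokens that the decoded path labels as 'price'."""
    return [t for t, s in zip(tokens, viterbi(tokens)) if s == "price"]


if __name__ == "__main__":
    sample = ["Laptop", "for", "$", "499.99", "today"]
    print(viterbi(sample))
    print(extract_prices(sample))
```

Decoding in log space avoids numerical underflow on long token sequences. A system covering the other fields named in the abstract (descriptions, locations, images) would presumably need additional states, which this sketch leaves out.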