An extraction method to get a municipality event information

  • Authors:
  • Tatsuya Ushioda;Shigeru Fujita

  • Affiliations:
  • Graduate School of Information and Computer Science, Chiba Institute of Technology; Dept. of Computer Science, Chiba Institute of Technology, Chiba, Japan

  • Venue:
  • ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part IV
  • Year:
  • 2010

Abstract

The purpose of this study is to acquire, in machine-processable form, the information published on the event information pages of municipality websites. In this paper, we propose a dictionary-based method for extracting information from an HTML document. The method deletes the HTML tags from the document to convert it into plain text, and then extracts the target character strings by comparing the text with a collection of words prepared beforehand. An evaluation experiment was conducted on the municipalities of the 23 wards of Tokyo and 56 municipalities of Chiba Prefecture in Japan. Overall, the proposed method was able to extract 73% of the event information, compared with 52% for the LR-Wrapper, 55% for the Tree-Wrapper, and 32% for the PLR-Wrapper. These results confirm that combining a simple algorithm with a collection of words extracts event information at a higher rate than the existing methods.
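
The two steps named in the abstract (deleting the HTML tags, then comparing the resulting text with a prepared collection of words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the matching rule, the dictionary contents, or the output format, so the function names `html_to_lines` and `extract_by_dictionary`, the plain substring test, and the sample page and word list are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def lines(self):
        return list(self._chunks)


def html_to_lines(html):
    """Step 1 (assumed form): delete the tags and return the text nodes."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.lines()


def extract_by_dictionary(html, dictionary):
    """Step 2 (assumed form): keep text lines containing a dictionary word.

    Returns (line, matched_words) pairs. Substring matching is an
    assumption; the paper may use a different comparison.
    """
    results = []
    for line in html_to_lines(html):
        hits = [word for word in dictionary if word in line]
        if hits:
            results.append((line, hits))
    return results


if __name__ == "__main__":
    # Hypothetical page and word collection, for illustration only.
    sample = (
        "<html><body><h1>City News</h1>"
        "<p>Summer Festival: August 7, City Park</p>"
        "<p>Contact us</p></body></html>"
    )
    print(extract_by_dictionary(sample, ["Festival", "Park", "Concert"]))
    # prints [('Summer Festival: August 7, City Park', ['Festival', 'Park'])]
```

Unlike the wrapper methods compared in the abstract, this sketch does not rely on the tag structure of a particular page; it only looks at the text that remains after the tags are deleted. In this form, how well it works depends on the word collection.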