Extracting the author of web pages

  • Authors:
  • Yoshikiyo Kato; Daisuke Kawahara; Kentaro Inui; Sadao Kurohashi; Tomohide Shibata

  • Affiliations:
  • National Institute of Information and Communications Technology, Seika, Soraku, Kyoto, Japan; National Institute of Information and Communications Technology, Seika, Soraku, Kyoto, Japan; National Institute of Information and Communications Technology, Seika, Soraku, Kyoto, Japan; National Institute of Information and Communications Technology / Kyoto University, Kyoto, Japan; Kyoto University, Kyoto, Japan

  • Venue:
  • Proceedings of the 2nd ACM workshop on Information credibility on the web
  • Year:
  • 2008

Abstract

In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts author name candidates from a Web page based on linguistic features and ranks the candidates based on local features, such as distance from the main content. The evaluation shows that the method achieves more than 75% precision when the candidates ranked within the top five are considered.
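
The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each candidate carries a character offset into the page text, that the main content is a known span of offsets, and that distance from that span is the only ranking feature. The names `Candidate`, `distance_to_span`, `rank_candidates`, and `hit_in_top_k` are hypothetical, as are the example names and offsets.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str    # candidate author name string (hypothetical data)
    offset: int  # character offset of the candidate in the page text (assumed representation)


def distance_to_span(offset: int, start: int, end: int) -> int:
    """Distance from an offset to the main-content span; 0 if the offset lies inside it."""
    if offset < start:
        return start - offset
    if offset > end:
        return offset - end
    return 0


def rank_candidates(candidates, main_start, main_end):
    """Sort candidates by ascending distance from the main content (the single assumed feature)."""
    return sorted(
        candidates,
        key=lambda c: distance_to_span(c.offset, main_start, main_end),
    )


def hit_in_top_k(ranked, gold_name, k=5):
    """Return True if the correct author name appears among the top-k ranked candidates."""
    return any(c.name == gold_name for c in ranked[:k])


# Illustrative page: main content assumed to span offsets 100..900.
candidates = [
    Candidate("Alice", 950),   # 50 characters after the main content
    Candidate("Bob", 40),      # 60 characters before the main content
    Candidate("Carol", 1500),  # 600 characters after the main content
]
ranked = rank_candidates(candidates, main_start=100, main_end=900)
ranked_names = [c.name for c in ranked]
```

Here `hit_in_top_k(ranked, gold_name)` checks a single page; averaging such checks over a labeled set of pages would give one way to compute a top-five figure, though the abstract does not specify how the reported precision is calculated.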