HTML2RSS: automatic generation of RSS feed based on structure analysis of HTML document

  • Authors:
  • Tomoyuki Nanno; Manabu Okumura

  • Affiliations:
  • Tokyo Institute of Technology, Yokohama, Kanagawa, JAPAN; Tokyo Institute of Technology, Yokohama, Kanagawa, JAPAN

  • Venue:
  • Proceedings of the 15th international conference on World Wide Web
  • Year:
  • 2006

Abstract

We present a system to automatically generate RSS feeds from HTML documents that consist of time-series items with date expressions, e.g., archives of weblogs, BBSs, chats, mailing lists, site update descriptions, and event announcements. Our system extracts date expressions, performs structure analysis of an HTML document, and detects or generates titles from the document.
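
The three stages named in the abstract (date extraction, structure analysis, title detection) can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes dates appear only as `YYYY-MM-DD` or `YYYY/MM/DD`, treats every text node that contains a date as the start of a new item, and uses the first text after the date as a stand-in title. The names `DATE_RE`, `TextChunks`, `split_items`, and `to_rss` are hypothetical and chosen only for this example.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime
from html.parser import HTMLParser

# Assumption: only numeric dates such as 2006-05-23 or 2006/05/23 are recognized.
DATE_RE = re.compile(r"\b(\d{4})[-/](\d{1,2})[-/](\d{1,2})\b")


class TextChunks(HTMLParser):
    """Collect the non-empty text nodes of an HTML document in order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def split_items(html):
    """Crude stand-in for structure analysis: a date starts a new item."""
    parser = TextChunks()
    parser.feed(html)
    parser.close()
    items = []
    for chunk in parser.chunks:
        match = DATE_RE.search(chunk)
        date = None
        if match:
            try:
                date = datetime(*map(int, match.groups()))
            except ValueError:
                date = None  # e.g. 2006-13-45: not a real date, keep as body text
        if date is not None:
            items.append({"date": date, "body": []})
            rest = chunk[match.end():].strip()
            if rest:
                items[-1]["body"].append(rest)
        elif items:
            items[-1]["body"].append(chunk)
    return items


def to_rss(items, feed_title, link):
    """Serialize the items as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = feed_title
    for item in items:
        node = ET.SubElement(channel, "item")
        # Stand-in for title detection: first text after the date, truncated.
        if item["body"]:
            title = item["body"][0][:60]
        else:
            title = item["date"].date().isoformat()
        ET.SubElement(node, "title").text = title
        ET.SubElement(node, "description").text = " ".join(item["body"])
        ET.SubElement(node, "pubDate").text = format_datetime(item["date"])
    return ET.tostring(rss, encoding="unicode")
```

Calling `to_rss(split_items(page), "Site updates", "http://example.com/")` on a page whose entries each begin with a date yields one `<item>` per entry. Text before the first date is discarded, and text preceding a date within the same node is dropped. A real system would need a much richer date grammar and an analysis of the HTML tag structure rather than a flat scan of text nodes.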