World Wide Web site summarization

  • Authors:
  • Yongzheng Zhang;Nur Zincir-Heywood;Evangelos Milios

  • Affiliations:
  • Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada B3H 1W5;Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada B3H 1W5;Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada B3H 1W5

  • Venue:
  • Web Intelligence and Agent Systems
  • Year:
  • 2004

Abstract

Summaries of Web sites help Web users get an idea of the site contents without having to spend time browsing the sites. Currently, manually constructed summaries written by volunteer experts are available through directories such as the DMOZ Open Directory Project. This research aims to automate the Web site summarization task. To that end, an approach that applies machine learning and natural language processing techniques is developed to summarize a Web site automatically. The information content of the automatically generated summaries is compared, via a formal evaluation involving human subjects, to that of DMOZ summaries, home page browsing, and time-limited site browsing, for a number of academic and commercial Web sites. Statistical evaluation of the scores of the answers to a list of questions about the sites demonstrates that the automatically generated summaries convey the same information to the reader as DMOZ summaries do, and more information than the two browsing options.
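
The abstract does not say which learning or language-processing techniques the approach uses, so the sketch below is only an illustration of the general idea of extractive site summarization, not the authors' method. It assumes the site's pages are already available as plain text, scores each sentence by the average site-wide frequency of its content words, and returns the top-scoring sentences in their original order. The function names, the stopword list, and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset({
    "the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on",
    "with", "that", "this", "it", "as", "by", "at", "from", "or", "be",
    "we", "our",
})


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize_site(pages, num_sentences=2):
    """Pick the sentences whose content words are most frequent site-wide.

    `pages` is a list of plain-text page bodies (hypothetical input format).
    """
    # Word frequencies are computed over the whole site, not per page.
    freq = Counter(w for page in pages for w in tokenize(page))

    scored = []
    order = 0  # position of the sentence in the site, used to restore order
    for page in pages:
        for sentence in split_sentences(page):
            words = tokenize(sentence)
            if words:
                score = sum(freq[w] for w in words) / len(words)
                scored.append((score, order, sentence))
            order += 1

    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:num_sentences]
    # Present the selected sentences in their original order.
    return [sentence for _, _, sentence in sorted(top, key=lambda t: t[1])]


if __name__ == "__main__":
    pages = [
        "Our lab studies machine learning. "
        "Machine learning powers our summarization research.",
        "Contact us by email. Parking is available nearby.",
    ]
    for line in summarize_site(pages, num_sentences=2):
        print(line)
```

A frequency-based scorer is used here only because it is short and self-contained; a learned sentence or key-phrase classifier could be substituted in `summarize_site` without changing the surrounding structure.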