METEOR: metadata and instance extraction from object referral lists on the web

  • Authors:
  • Hasan Davulcu; Srinivas Vadrevu; Saravanakumar Nagarajan; Fatih Gelgi

  • Affiliations:
  • Arizona State University, Tempe, AZ; Arizona State University, Tempe, AZ; Arizona State University, Tempe, AZ; Arizona State University, Tempe, AZ

  • Venue:
  • WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
  • Year:
  • 2005

Abstract

The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by the human eye, "meaningful information" is still largely inaccessible to computer applications. In this paper we present the METEOR system, which utilizes presentation and linkage regularities in referral lists of various sorts to automatically separate and extract metadata and instance information. Experimental results for the university domain, covering 12 computer science department Web sites comprising 361 individual faculty and course home pages, indicate that metadata and instance extraction average 85% and 88% F-measure, respectively. METEOR achieves this performance without any domain-specific engineering.