Automatic Extraction of Publication Time from News Search Results

  • Authors:
  • Yiyao Lu;Weiyi Meng;Wanjing Zhang;King-Lup Liu;Clement Yu

  • Affiliations:
  • SUNY at Binghamton;SUNY at Binghamton;SUNY at Binghamton;Webscalers, LLC;U. of Illinois at Chicago

  • Venue:
  • ICDEW '06 Proceedings of the 22nd International Conference on Data Engineering Workshops
  • Year:
  • 2006

Abstract

The publication time of a page can strongly affect its relevance to a query, especially for time-sensitive pages such as news items. For news search engines, the publication time of news items can usually be found in the returned search result records. In this paper, we introduce a method that automatically extracts the publication time of each news story returned by a news search engine, based on several important observations we made. We also introduce a wrapper implementation of the extraction method. Experimental results on data collected from 50 news search engines show that our method is effective and that the wrapper implementation improves both extraction accuracy and extraction efficiency.
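
To make the task concrete, the following is a minimal sketch of what pulling a publication time out of the text of a search result record might look like. It is not the method of the paper, whose observations and wrapper design are not described in this abstract. The two regular expressions, the function name `extract_publication_time`, and the sample record strings are all assumptions chosen for illustration.

```python
import re
from datetime import datetime, timedelta

# Hypothetical patterns: real news search engines use many more formats.
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
MONTHS = {name: index + 1 for index, name in enumerate(MONTH_NAMES)}

# Absolute dates such as "Mar 28, 2006" or "January 5, 2006".
ABSOLUTE = re.compile(
    r"\b(" + "|".join(MONTH_NAMES) + r")[a-z]*\.?\s+(\d{1,2}),\s+(\d{4})\b"
)
# Relative times such as "3 hours ago".
RELATIVE = re.compile(r"\b(\d+)\s+(minute|hour|day)s?\s+ago\b", re.IGNORECASE)


def extract_publication_time(record_text, now):
    """Return a datetime found in record_text, or None if no pattern matches.

    `now` is the retrieval time, needed to resolve relative expressions.
    """
    match = ABSOLUTE.search(record_text)
    if match:
        month = MONTHS[match.group(1)]
        return datetime(int(match.group(3)), month, int(match.group(2)))
    match = RELATIVE.search(record_text)
    if match:
        unit = match.group(2).lower() + "s"  # "hour" -> "hours"
        return now - timedelta(**{unit: int(match.group(1))})
    return None


# Usage with made-up records and a fixed retrieval time.
retrieved_at = datetime(2006, 4, 3, 12, 0)
print(extract_publication_time("Storm hits coast - Example News - Mar 28, 2006", retrieved_at))
print(extract_publication_time("Markets rally - Example Wire - 3 hours ago", retrieved_at))
```

A fixed pattern list like this breaks as soon as an engine uses an unseen format, which suggests why a per-engine wrapper, as the abstract proposes, could help with both accuracy and efficiency.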