A learning based model for headline extraction of news articles to find explanatory sentences for events

  • Authors:
  • Sandip Debnath;C. Lee Giles

  • Affiliations:
  • Penn State University, University Park, PA;Penn State University, University Park, PA

  • Venue:
  • Proceedings of the 3rd international conference on Knowledge capture
  • Year:
  • 2005

Abstract

Metadata plays a crucial role in making documents easier to organise and archive. News metadata includes the DateLine, ByLine, HeadLine and many other fields. We found that HeadLine information is useful for inferring the theme of a news article. For financial news articles in particular, the HeadLine can therefore be especially helpful in locating sentences that explain major events such as significant changes in stock prices. In this paper we explore a support-vector-based learning approach to extract HeadLine metadata automatically. We find that the classification accuracy for HeadLines improves if DateLines are identified first. We then use the extracted HeadLines to initiate keyword pattern matching that finds the sentences responsible for the story theme. Using this theme and a simple language model, it is possible to locate explanatory sentences for any significant price change.
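
The keyword-matching step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the HeadLine has already been extracted and simply ranks sentences by how many HeadLine keywords they share. The stop-word list, the function names `keywords` and `rank_sentences`, and the sample article are all hypothetical.

```python
import re

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "to", "for", "and", "as", "after", "by"}

def keywords(text):
    """Return the set of lower-cased word tokens, minus stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}

def rank_sentences(headline, sentences):
    """Pair each sentence with its keyword overlap against the headline,
    highest overlap first. The sort is stable, so ties keep document order."""
    head = keywords(headline)
    scored = [(len(head & keywords(s)), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Made-up example data, for illustration only.
headline = "Acme shares fall after profit warning"
sentences = [
    "Markets were mixed on Tuesday.",
    "Acme issued a profit warning, and its shares fell 12 percent.",
    "Analysts expect a recovery next year.",
]
ranked = rank_sentences(headline, sentences)
for score, sentence in ranked:
    print(score, sentence)
```

Exact token overlap misses inflected forms ("fall" versus "fell"), so a fuller version would probably need stemming or the language model mentioned in the abstract.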