An information-pattern-based approach to novelty detection

  • Authors: Xiaoyan Li; W. Bruce Croft

  • Affiliations: Department of Computer Science, Mount Holyoke College, 50 College Street, South Hadley, MA 01075, United States; Department of Computer Science, University of Massachusetts, Amherst, MA 01002, United States

  • Venue: Information Processing and Management: an International Journal
  • Year: 2008

Abstract

This paper proposes a new approach to novelty detection based on identifying sentence-level information patterns. First, "novelty" is redefined in terms of the proposed information patterns, and several types of information patterns are given, each corresponding to a different type of user information need. Second, sentence-level information patterns are analyzed in depth using data from the TREC novelty tracks, including sentence lengths, named entities (NEs), and sentence-level opinion patterns. Finally, a unified information-pattern-based approach to novelty detection (ip-BAND) is presented for both specific NE topics and more general topics. Experiments on data from the TREC 2002, 2003, and 2004 novelty tracks show that the proposed approach significantly improves novelty detection performance in terms of precision at top ranks. Future research directions are also suggested.
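
The abstract reports results as precision at top ranks. As a rough illustration of that kind of measure, the sketch below computes precision over the top k sentences of a ranked list. It is not the authors' evaluation code: the function name `precision_at_k`, the sentence identifiers, and the choice to divide by k even when fewer than k sentences are returned are all assumptions made for this example.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], novel_ids: Set[str], k: int) -> float:
    """Fraction of the top-k ranked sentences that are judged novel.

    ranked_ids: sentence ids ordered by a system's novelty score (hypothetical ids).
    novel_ids:  ids that assessors marked as novel (hypothetical judgments).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for sid in top_k if sid in novel_ids)
    # Assumption: divide by k, not len(top_k), so short result lists are penalized.
    return hits / k


# Made-up ranking and judgments, for illustration only.
ranked = ["s3", "s1", "s7", "s2", "s5"]
novel = {"s1", "s2", "s9"}
print(precision_at_k(ranked, novel, 3))
print(precision_at_k(ranked, novel, 5))
```

In this toy ranking, one of the top three sentences and two of the top five are in the judged-novel set, so the measure depends on the cutoff k.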