The nature of novelty detection

  • Authors:
  • Le Zhao;Min Zhang;Shaoping Ma

  • Affiliations:
  • State Key Lab of Intelligent Technologies and System, Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China 100084 (all authors)

  • Venue:
  • Information Retrieval
  • Year:
  • 2006

Abstract

Sentence-level novelty detection aims at spotting sentences that carry novel information in an ordered sentence list. In this task, sentences that appear later in the list and add no new meaning are eliminated. The contributions of this paper to novelty detection are three-fold. First, conceptually, this paper reveals the computational nature of the task, currently overlooked by the Novelty community: novelty as a combination of partial overlap (PO) and complete overlap (CO) relations between sentences. We define partial overlap between two sentences as the sharing of common facts, while complete overlap holds when one sentence covers all of the meanings of the other. Second, technically, we provide a novel approach, the selected pool method, which follows naturally from the PO-CO computational structure. We provide a formal error analysis for selected pool and for other methods based on this PO-CO framework, and we address the question of how accurate the PO judgments must be for selected pool to outperform the baseline pool method. Third, experimentally, results are presented for all three novelty datasets currently available. The results show that selected pool is either significantly better than or no worse than current methods, an indication that the term overlap criterion for the PO judgments could be adequately accurate.
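
The abstract gives no formulas, so the following is only a minimal Python sketch of how a PO-then-CO pipeline of this kind might look, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the whitespace tokenizer, the `term_overlap` measure (fraction of the current sentence's terms found in the other term set), the function name `selected_pool_novelty`, and the thresholds `po_threshold` and `co_threshold`. For each sentence, earlier sentences that partially overlap with it are merged into a pool, and the sentence is treated as redundant when that pool covers nearly all of its terms.

```python
from typing import List, Set


def tokenize(sentence: str) -> Set[str]:
    """Lowercased set of terms; a stand-in for real preprocessing (assumption)."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def term_overlap(a: Set[str], b: Set[str]) -> float:
    """Fraction of the terms in `a` that also occur in `b` (assumed measure)."""
    return len(a & b) / len(a) if a else 0.0


def selected_pool_novelty(
    sentences: List[str],
    po_threshold: float = 0.3,  # hypothetical cut-off for the PO judgment
    co_threshold: float = 0.8,  # hypothetical cut-off for the CO judgment
) -> List[bool]:
    """Return one flag per sentence: True if judged novel, False if redundant."""
    history: List[Set[str]] = []
    flags: List[bool] = []
    for sentence in sentences:
        terms = tokenize(sentence)
        # PO step: pool only the earlier sentences that share enough terms.
        pool: Set[str] = set()
        for previous in history:
            if term_overlap(terms, previous) >= po_threshold:
                pool |= previous
        # CO step: redundant if the selected pool covers nearly all terms.
        flags.append(term_overlap(terms, pool) < co_threshold)
        history.append(terms)
    return flags


example = [
    "The storm hit the coast on Monday.",
    "Officials closed the harbor.",
    "The storm hit the coast.",
    "The storm hit the coast and flooded two towns on Monday.",
]
print(selected_pool_novelty(example))  # [True, True, False, True]
```

In this sketch, setting `po_threshold=0.0` selects every earlier sentence, so the function degenerates into a plain pool comparison against all preceding text. That makes it easy to compare the two behaviours on the same input, although the sketch says nothing about how the paper's baseline is actually defined.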