An information-pattern-based approach to novelty detection
Information Processing and Management: an International Journal
Novelty as a form of contextual re-ranking: efficient KLD models and mixture models
Proceedings of the second international symposium on Information interaction in context
The effect of smoothing in language models for novelty detection
FDIA'07 Proceedings of the 1st BCS IRSG conference on Future Directions in Information Access
Named entity patterns across news domains
FDIA'07 Proceedings of the 1st BCS IRSG conference on Future Directions in Information Access
Effective sentence retrieval based on query-independent evidence
Information Processing and Management: an International Journal
The detection of new information in a document stream is an important component of many potential applications. In this thesis, a new novelty detection approach based on the identification of sentence-level information patterns is proposed. Given a user's information need, some information patterns in sentences, such as combinations of query words, sentence lengths, named entities and phrases, and other sentence patterns, may contain more important and relevant information than single words. The work of the thesis has three parts. First, we redefine novelty detection in light of the proposed information patterns, and give examples of several types of information patterns corresponding to different types of users' information needs. Second, we analyze why the proposed information pattern concept has a significant impact on novelty detection. We present a thorough analysis of sentence-level information patterns on data from the TREC novelty tracks, covering sentence lengths, named entities (NEs), and sentence-level opinion patterns. Finally, we describe how we perform novelty detection based on information patterns, focusing on the identification of previously unseen query-related patterns in sentences. A unified pattern-based approach to novelty detection is presented for both specific NE topics and more general topics. Experiments on novelty detection were carried out on data from the TREC 2002, 2003 and 2004 novelty tracks. Experimental results show that the proposed approach significantly improves novelty detection performance for both specific and general topics, and therefore for all topics overall, in terms of precision at top ranks. Future research directions are suggested.
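The core idea of flagging sentences that carry previously unseen query-related patterns can be illustrated with a minimal sketch. This is not the thesis's method: it assumes the patterns (here, named entities) have already been extracted for each sentence, and it uses a simple "at least one unseen pattern" rule. The function name `detect_novel` and the sample data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def detect_novel(sentences: Iterable[Tuple[str, Set[str]]]) -> List[str]:
    """Return the sentences that contain at least one pattern not seen earlier.

    Each input item is (sentence text, set of patterns found in it).
    Pattern extraction (e.g. named entity recognition) is assumed to
    have happened upstream; this sketch only tracks what is new.
    """
    seen: Set[str] = set()   # patterns observed in earlier sentences
    novel: List[str] = []
    for text, patterns in sentences:
        if patterns - seen:  # some pattern has not appeared before
            novel.append(text)
        seen |= patterns     # remember every pattern in this sentence
    return novel


# Hypothetical stream of relevant sentences, in the order they arrive.
stream = [
    ("Acme hired Jane Doe.", {"Acme", "Jane Doe"}),
    ("Jane Doe joined Acme.", {"Jane Doe", "Acme"}),
    ("Jane Doe moved to Paris.", {"Jane Doe", "Paris"}),
]
novel_sentences = detect_novel(stream)
```

In this toy stream the second sentence repeats entities already seen, so only the first and third are kept. A real system would also need to restrict the patterns to those related to the query and rank the sentences rather than filter them.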