Combining semantic and syntactic document classifiers to improve first story detection

Authors:
Nicola Stokes;Joe Carthy
Affiliations:
Univ. College Dublin, Ireland;Univ. College Dublin, Ireland
Venue:
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2001

Citing 2
Cited 22

Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval

Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval
Lexical cohesion computed by thesaural relations as an indicator of the structure of text

Computational Linguistics

Novelty and redundancy detection in adaptive filtering

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Retrieval and novelty detection at the sentence level

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Exploiting concept clusters for content-based information retrieval

Information Sciences—Informatics and Computer Science: An International Journal
Using names and topics for new event detection

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Resource-adaptive real-time new event detection

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Hot Topic Extraction Based on Timeline Analysis and Multidimensional Sentence Modeling

IEEE Transactions on Knowledge and Data Engineering
Analyzing feature trajectories for event detection

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
New event detection based on indexing-tree and named entity

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic online news issue construction in web environment

Proceedings of the 17th international conference on World Wide Web
Keyword proximity search in complex data graphs

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Real-time new event detection for video streams

Proceedings of the 17th ACM conference on Information and knowledge management
Automatic online news topic ranking using media focus and user attention based on aging theory

Proceedings of the 17th ACM conference on Information and knowledge management
Topic Detection and Tracking for Threaded Discussion Communities

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
An Automatic Online News Topic Keyphrase Extraction System

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Automatic video tagging using content redundancy

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Online New Event Detection Based on IPLSA

ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
Bursty topics extraction for web forums

Proceedings of the eleventh international workshop on Web information and data management
New event detection and topic tracking in Turkish

Journal of the American Society for Information Science and Technology
Content redundancy in YouTube and its application to video tagging

ACM Transactions on Information Systems (TOIS)
Indices of novelty for emerging topic detection

Information Processing and Management: an International Journal
A model for anticipatory event detection

ER'06 Proceedings of the 25th international conference on Conceptual Modeling
A on-line news documents clustering method

AMT'12 Proceedings of the 8th international conference on Active Media Technology

Abstract

In this paper we describe a type of data fusion that combines evidence derived from multiple document representations. Our aim is to investigate whether a composite representation can improve the online detection of novel events in a stream of broadcast news stories. This classification process, known as first story detection (FSD), or as online new event detection in the Topic Detection and Tracking (TDT) pilot study [1], is one of three main classification tasks defined by the TDT initiative. Our composite document representation consists of a semantic representation (based on the lexical chains derived from a text) and a syntactic representation (using proper nouns). Using the TDT1 evaluation methodology, we evaluate a number of combinations of document representations using these document classifiers.
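
The idea of fusing evidence from two document representations can be illustrated with a minimal sketch. It is not the authors' system: the cosine measure, the linear weighting, the threshold value, and all function and field names (`cosine`, `combined_similarity`, `detect_first_stories`, `semantic`, `syntactic`) are assumptions made for illustration. The term counts under `semantic` merely stand in for concepts that lexical chaining would produce, and those under `syntactic` stand in for extracted proper nouns.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def combined_similarity(doc: dict, other: dict, weight: float = 0.5) -> float:
    """Linear fusion of the two per-representation similarities (assumed scheme)."""
    sem = cosine(doc["semantic"], other["semantic"])
    syn = cosine(doc["syntactic"], other["syntactic"])
    return weight * sem + (1.0 - weight) * syn


def detect_first_stories(stream, threshold: float = 0.2, weight: float = 0.5):
    """Flag a story as a first story when it resembles no earlier story.

    Stories are processed strictly in arrival order, so each decision
    uses only the stories seen before it (the online setting).
    """
    seen = []
    flags = []
    for doc in stream:
        best = max((combined_similarity(doc, old, weight) for old in seen), default=0.0)
        flags.append(best < threshold)  # hypothetical threshold, not from the paper
        seen.append(doc)
    return flags


# Toy stream: two stories on one event, then a story on an unrelated event.
stream = [
    {"semantic": Counter({"earthquake": 2, "disaster": 1}),
     "syntactic": Counter({"Kobe": 1, "Japan": 1})},
    {"semantic": Counter({"earthquake": 1, "disaster": 1}),
     "syntactic": Counter({"Kobe": 1})},
    {"semantic": Counter({"election": 1, "vote": 1}),
     "syntactic": Counter({"Dublin": 1})},
]
flags = detect_first_stories(stream)
```

In this toy stream the first and third stories are flagged as first stories and the second is not, because it shares both concepts and a proper noun with the first. Varying `weight` between 0 and 1 shifts the decision from relying only on the proper-noun evidence to relying only on the chain-derived evidence.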