Statistical models of topical content

Authors:
J. P. Yamron;L. Gillick;P. van Mulbregt;S. Knecht
Affiliations:
formerly of Dragon Systems/Lernout & Hauspie, 320 Nevada Street, Newton, MA;formerly of Dragon Systems/Lernout & Hauspie, 320 Nevada Street, Newton, MA;formerly of Dragon Systems/Lernout & Hauspie, 320 Nevada Street, Newton, MA;Dragon Systems/Lernout & Hauspie, 320 Nevada Street, Newton, MA
Venue:
Topic detection and tracking
Year:
2002

Abstract

In this chapter we explore the behavior of two statistical models, one based on simple unigrams and the other on the beta-binomial distribution, as applied to the problem of modeling story generation. We describe how these models can be incorporated into information extraction applications, particularly the Tracking and Detection engines built for the Topic Detection and Tracking evaluations sponsored by DARPA. Tracking systems based on the two models have complementary strengths and weaknesses: a Beta-Binomial system yields high precision at a high decision threshold, but its performance degrades quickly as the threshold drops; a Unigram system is not as strong at a high decision threshold, but is very good at suppressing false alarms at lower thresholds. We describe the features of these systems that give rise to this behavior, and discuss ways that each system might be improved by borrowing from the other. We also discuss our Detection system, and how improvements in Tracking should lead to improvements in Detection.
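
As a rough illustration of the two model families named in the abstract, the sketch below scores a story against a topic with a smoothed unigram log-likelihood ratio and evaluates a beta-binomial log-probability for a single word count. It is not the chapter's implementation: the function names, the probability floor, the length normalization, and the thresholding rule are all assumptions made for this example, and the abstract does not give the authors' actual parameterization.

```python
import math

def log_beta(x, y):
    """Log of the Beta function, computed via log-gamma for stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of seeing a word k times in a story of n tokens
    under a beta-binomial with shape parameters a and b (assumed names)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def unigram_loglik(tokens, probs, floor=1e-6):
    """Sum of per-token log-probabilities; unseen words get a small floor
    (an assumed smoothing choice, not taken from the source)."""
    return sum(math.log(probs.get(t, floor)) for t in tokens)

def unigram_score(tokens, topic_probs, background_probs):
    """Length-normalized log-likelihood ratio of topic vs. background."""
    if not tokens:
        return 0.0
    ratio = unigram_loglik(tokens, topic_probs) - unigram_loglik(tokens, background_probs)
    return ratio / len(tokens)

def is_on_topic(score, threshold):
    """Hypothetical tracking decision: accept a story when its score
    reaches the decision threshold."""
    return score >= threshold

if __name__ == "__main__":
    story = ["quake", "quake", "rescue"]
    topic = {"quake": 0.5, "rescue": 0.25}
    background = {"quake": 0.01, "rescue": 0.01}
    s = unigram_score(story, topic, background)
    print("unigram score:", s, "on topic:", is_on_topic(s, 1.0))
    print("beta-binomial log P(k=2 | n=3):", beta_binomial_logpmf(2, 3, 1.0, 4.0))
```

Raising the threshold passed to `is_on_topic` trades missed stories for fewer false alarms, which is the operating range over which the abstract compares the two systems.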