Off-topic detection in conversational telephone speech

Authors:
Robin Stewart;Andrea Danyluk;Yang Liu
Affiliations:
Williams College, Williamstown, MA;Williams College, Williamstown, MA;UT Dallas, Richardson, TX
Venue:
ACTS '09 Proceedings of the HLT-NAACL 2006 Workshop on Analyzing Conversations in Text and Speech
Year:
2006

Citing 6
Cited 0

A sequential algorithm for training text classifiers

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Statistical Models for Text Segmentation

Machine Learning - Special issue on natural language learning
TextTiling: segmenting text into multi-paragraph subtopic passages

Computational Linguistics
A simple rule-based part of speech tagger

ANLC '92 Proceedings of the third conference on Applied natural language processing
Statistical models for topic segmentation

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)

Quantified Score

Hi-index	0.00

Visualization

Abstract

In a context where information retrieval is extended to spoken "documents" including conversations, it will be important to provide users with the ability to seek informational content, rather than socially motivated small talk that appears in many conversational sources. In this paper we present a preliminary study aimed at automatically identifying "irrelevance" in the domain of telephone conversations. We apply a standard machine learning algorithm to build a classifier that detects off-topic sections with better-than-chance accuracy and that begins to provide insight into the relative importance of features for identifying utterances as on topic or not.