Classifying the socio-situational settings of transcripts of spoken discourses

Authors:
Yangyang Shi;Pascal Wiggers;Catholijn M. Jonker
Venue:
Speech Communication
Year:
2013

Abstract

In this paper, we investigate automatic classification of the socio-situational settings of transcripts of spoken discourse. Knowledge of the socio-situational setting can be used to search for content recorded in a particular setting or to select context-dependent models, for example in speech recognition. The subjective experiment we report on in this paper shows that people correctly classify 68% of the socio-situational settings. Based on the cues that participants mentioned in the experiment, we developed two types of automatic socio-situational setting classification methods: a static method using support vector machines (s3c-svm) and a dynamic method using dynamic Bayesian networks (s3c-dbn). Using these two methods, we developed classifiers based on various features and combinations of features. The s3c-svm method with sentence length, function word ratio, single occurrence word ratio, part-of-speech (pos) tags and words as features reaches a classification accuracy of almost 90%. A bigram s3c-dbn with pos tag and word features yields a dynamic classifier with nearly 89% classification accuracy. The dynamic classifiers not only achieve results similar to those of the static classifiers, but can also track the socio-situational setting while processing a transcript or conversation. On discourses with a static socio-situational setting, the dynamic classifiers need only the initial 25% of the data to reach an accuracy close to that obtained when all data of a transcript are used.
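As a rough illustration of the surface features named in the abstract, the sketch below computes sentence length, function word ratio and single occurrence word ratio for a transcript. The abstract does not give the exact definitions, so everything here is an assumption: the function name `surface_features`, the whitespace tokenization, the tiny English `FUNCTION_WORDS` list, and the choice to define the single occurrence ratio as the share of tokens whose word appears exactly once. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list (assumption for illustration;
# a real system would use a language-specific list or POS tags).
FUNCTION_WORDS = frozenset(
    {"the", "a", "an", "and", "or", "but", "of", "to", "in", "on",
     "i", "you", "it", "is", "that"}
)


def surface_features(sentences):
    """Compute three transcript-level features from a list of sentence strings.

    Assumed definitions (not taken from the paper):
      - avg_sentence_length: tokens per non-empty sentence
      - function_word_ratio: share of tokens found in FUNCTION_WORDS
      - single_occurrence_ratio: share of tokens whose word occurs exactly once
    """
    # Naive lowercase whitespace tokenization (assumption).
    tokenized = [s.lower().split() for s in sentences]
    tokens = [tok for sent in tokenized for tok in sent]
    if not tokens:
        return {
            "avg_sentence_length": 0.0,
            "function_word_ratio": 0.0,
            "single_occurrence_ratio": 0.0,
        }

    counts = Counter(tokens)
    n_sentences = sum(1 for sent in tokenized if sent)
    n_function = sum(1 for tok in tokens if tok in FUNCTION_WORDS)
    n_single = sum(1 for c in counts.values() if c == 1)

    return {
        "avg_sentence_length": len(tokens) / n_sentences,
        "function_word_ratio": n_function / len(tokens),
        "single_occurrence_ratio": n_single / len(tokens),
    }


if __name__ == "__main__":
    print(surface_features(["the cat sat", "the dog"]))
```

A feature dictionary like this could be concatenated with POS-tag and word counts and passed to a linear classifier; that pipeline is left out because the abstract does not specify how the features are encoded.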