Unsupervised topic modelling for multi-party spoken discourse

Authors:
Matthew Purver; Thomas L. Griffiths; Konrad P. Körding; Joshua B. Tenenbaum
Affiliations:
Stanford University, Stanford, CA; Brown University, Providence, RI; Massachusetts Institute of Technology, Cambridge, MA; Massachusetts Institute of Technology, Cambridge, MA
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Citing 11

Statistical Models for Text Segmentation
Machine Learning - Special issue on natural language learning

Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

Topic segmentation with an aspect hidden Markov model
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval

A critique and improvement of an evaluation metric for text segmentation
Computational Linguistics

Improved Topic Discrimination of Broadcast News Using a Model of Multiple Simultaneous Topics
ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97) - Volume 2

Latent dirichlet allocation
The Journal of Machine Learning Research

Multi-paragraph segmentation of expository text
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics

Segmentation and Classification of Meeting Events using Multiple Classifier Fusion and Dynamic Programming
ICPR '04 Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04) - Volume 3

Statistical models for topic segmentation
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics

Discourse segmentation of multi-party conversation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1

The necessity of a meeting recording and playback system, and the benefit of topic-level annotations to meeting browsing
INTERACT'05 Proceedings of the 2005 IFIP TC13 international conference on Human-Computer Interaction

Cited 26

Segmenting meetings into agenda items by extracting implicit supervision from human note-taking
Proceedings of the 12th international conference on Intelligent user interfaces

Modeling online reviews with multi-grain topic models
Proceedings of the 17th international conference on World Wide Web

Meeting adjourned: off-line learning interfaces for automatic meeting understanding
Proceedings of the 13th international conference on Intelligent user interfaces

HCI Beyond the GUI: Design for Haptic, Speech, Olfactory, and Other Nontraditional Interfaces

Inferring tutorial dialogue structure with hidden Markov modeling
EdAppsNLP '09 Proceedings of the Fourth Workshop on Innovative Use of NLP for Building Educational Applications

Bayesian unsupervised topic segmentation
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Hierarchical text segmentation from multi-scale lexical cohesion
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Global models of document structure using latent permutations
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Learning document-level semantic properties from free-text annotations
Journal of Artificial Intelligence Research

Classification of patient case discussions through analysis of vocalisation graphs
Proceedings of the 2009 international conference on Multimodal interfaces

Dialogue segmentation with large numbers of volunteer internet annotators
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2

Participant subjectivity and involvement as a basis for discourse segmentation
SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Content modeling using latent permutations
Journal of Artificial Intelligence Research

A statistical model for topic segmentation and clustering
Canadian AI'08 Proceedings of the Canadian Society for computational studies of intelligence, 21st conference on Advances in artificial intelligence

The CALO meeting assistant system
IEEE Transactions on Audio, Speech, and Language Processing

Holistic sentiment analysis across languages: multilingual supervised latent Dirichlet allocation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Exploiting conversation structure in unsupervised topic segmentation for emails
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Multi-document topic segmentation
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management

Dialogue act modeling in a complex task-oriented domain
SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Discovering K web user groups with specific aspect interests
MLDM'12 Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition

Discourse structure and computation: past, present and future
ACL '12 Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

SITS: a hierarchical nonparametric model using speaker identity for topic segmentation in multiparty conversations
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Influence relation estimation based on lexical entrainment in conversation
Speech Communication

Discourse structure and language technology
Natural Language Engineering

On handling textual errors in latent document modeling
Proceedings of the 22nd ACM international conference on information & knowledge management

Topic segmentation and labeling in asynchronous conversations
Journal of Artificial Intelligence Research

Abstract

We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party discourse transcripts. We show how Bayesian inference in this generative model can address topic segmentation and topic identification simultaneously. The method automatically divides multi-party meetings into topically coherent segments, with performance that compares well with previous unsupervised segmentation-only methods (Galley et al., 2003), while also extracting topics that human judges rate highly for coherence. We also show that the method appears robust to off-topic dialogue and speech recognition errors.
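
The abstract does not spell out the generative model, so the following Python sketch is only an illustration of the general idea, not the authors' formulation. It assumes an LDA-style setup in which each utterance either keeps the previous utterance's topic mixture or, at a topic shift, draws a fresh mixture from a symmetric Dirichlet prior. The function names, the toy `TOPICS` table, and the parameters `alpha` and `shift_prob` are all hypothetical. Inference would run in the opposite direction, recovering the shift points (segmentation) and the topic-word distributions (identification) from the observed words.

```python
import random

# Hypothetical toy topics: each maps words to probabilities (illustration only).
TOPICS = [
    {"budget": 0.5, "cost": 0.3, "schedule": 0.2},
    {"speech": 0.4, "recognition": 0.4, "error": 0.2},
]

def sample_dirichlet(alpha, k, rng):
    """Draw from a symmetric k-dimensional Dirichlet via normalised Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_transcript(n_utterances, words_per_utterance, topics,
                        alpha=0.5, shift_prob=0.1, seed=0):
    """Sample a toy transcript plus the (normally hidden) topic-shift indicators.

    Assumed process: the topic mixture is held fixed within a segment and is
    redrawn whenever a shift occurs; every word picks a topic from the current
    mixture and then a word from that topic's distribution.
    """
    rng = random.Random(seed)
    k = len(topics)
    theta = sample_dirichlet(alpha, k, rng)  # mixture for the first segment
    transcript, shifts = [], []
    for u in range(n_utterances):
        # The first utterance opens a segment by definition, so it is never a shift.
        shift = u > 0 and rng.random() < shift_prob
        if shift:
            theta = sample_dirichlet(alpha, k, rng)  # new segment, new mixture
        words = []
        for _ in range(words_per_utterance):
            z = rng.choices(range(k), weights=theta)[0]  # topic assignment
            vocab, probs = zip(*topics[z].items())
            words.append(rng.choices(vocab, weights=probs)[0])
        transcript.append(words)
        shifts.append(shift)
    return transcript, shifts

if __name__ == "__main__":
    utterances, boundaries = generate_transcript(12, 4, TOPICS)
    for words, is_shift in zip(utterances, boundaries):
        print("SHIFT" if is_shift else "     ", " ".join(words))
```

Tying the mixture to the segment rather than to a whole document is what lets a single model cover both tasks: the shift indicators give the segmentation, and the shared topic-word distributions give the topics.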