Automatically assessing the post quality in online discussions on software

Authors: Markus Weimer, Iryna Gurevych, Max Mühlhäuser
Affiliation: Darmstadt University of Technology, Germany (all authors)
Venue: ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Year: 2007


Abstract

Assessing the quality of user-generated content is an important problem for many web forums. Quality is currently assessed manually; we propose an algorithm that assesses the quality of forum posts automatically and test it on data provided by Nabble.com. We use state-of-the-art classification techniques and experiment with five feature classes: Surface, Lexical, Syntactic, Forum-specific, and Similarity features. We achieve an accuracy of 89% on the task of automatically assessing post quality in the software domain using forum-specific features. Without forum-specific features, we achieve an accuracy of 82%.
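
To make the idea of a "Surface" feature class concrete, the following is a minimal sketch of how a forum post could be turned into a small feature vector before being passed to a classifier. The abstract does not list the individual features, so the function name `surface_features` and the four features computed here (token count, share of question sentences, share of exclamation sentences, share of all-capital words) are illustrative assumptions, not the authors' feature set.

```python
import re


def surface_features(post: str) -> dict:
    """Compute a few hypothetical surface features for one forum post.

    Assumption: these four features are illustrative only; the abstract
    names the feature class but not its members.
    """
    tokens = post.split()
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", post.strip()) if s]
    n_sent = max(len(sentences), 1)  # avoid division by zero on empty posts
    n_tok = max(len(tokens), 1)
    return {
        "length": len(tokens),
        "question_freq": sum(s.endswith("?") for s in sentences) / n_sent,
        "exclamation_freq": sum(s.endswith("!") for s in sentences) / n_sent,
        # Count words written entirely in capitals, ignoring single letters.
        "capital_word_freq": sum(
            t.isupper() and len(t) > 1 for t in tokens
        ) / n_tok,
    }


if __name__ == "__main__":
    print(surface_features("How do I install this? It FAILS every time!"))
```

A dictionary like this could be concatenated with the other feature classes and fed to any standard classifier; which classifier and which features work best is an empirical question that the sketch does not address.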