Stack Overflow is the most popular community-based question answering (CQA) website for programmers, with 2.05M users, 5.1M questions and 9.4M answers. Stack Overflow has explicit, detailed guidelines on how to post questions and an ebullient moderation community. Despite these guidelines and safeguards, questions posted on Stack Overflow can be extremely off topic or very poor in quality. Such questions can be deleted at the discretion of experienced community members and moderators. We present the first study of deleted questions on Stack Overflow. We divide our study into two parts: (i) characterization of deleted questions over ~5 years (2008-2013) of data, and (ii) prediction of deletion at the time of question creation. Our characterization study reveals multiple insights into the question deletion phenomenon. We find that it takes substantial time for a question to receive delete votes, but once it is voted for deletion, the community takes swift action. We also see that question authors delete their own questions to salvage reputation points. We notice some instances of accidental deletion of good-quality questions, but such questions are quickly voted back to be undeleted. We discover a pyramidal structure of question quality on Stack Overflow and find that deleted questions lie at the bottom (lowest quality) of the pyramid. We also build a predictive model to detect, at creation time, whether a question will be deleted. We experiment with 47 features, based on User Profile, Community Generated, Question Content and Syntactic Style, and report an accuracy of 66%. Our findings yield suggestions for content quality maintenance on community-based question answering websites. To the best of our knowledge, this is the first large-scale study of poor-quality (deleted) questions on Stack Overflow.
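To make the prediction setup concrete, the following is a minimal sketch of what "prediction at creation time" could look like: only signals available when the question is posted are turned into features, one or two per feature group named above, and accuracy is the fraction of questions whose predicted label matches the eventual outcome. Every name here (`Question`, `extract_features`, `predict_deleted`, the individual features and thresholds) is a hypothetical illustration and not taken from the paper; the hand-set rule merely stands in for a trained classifier, and the 47 actual features are not reproduced.

```python
import re
from dataclasses import dataclass


@dataclass
class Question:
    """A question as seen at creation time (all fields are hypothetical)."""
    title: str
    body: str
    author_reputation: int       # user-profile signal
    author_prior_deleted: int    # community-generated signal about the author
    deleted: bool = False        # label, known only after the fact


def extract_features(q: Question) -> dict:
    """Build a few illustrative features per group; not the paper's feature set."""
    text = q.title + " " + q.body
    return {
        "user_reputation": float(q.author_reputation),
        "community_prior_deleted": float(q.author_prior_deleted),
        "content_has_code": 1.0 if "<code>" in q.body else 0.0,
        "content_length_words": float(len(q.body.split())),
        "syntax_lowercase_title": 1.0 if q.title[:1].islower() else 0.0,
        "syntax_punct_runs": float(len(re.findall(r"[!?]{2,}", text))),
    }


def predict_deleted(features: dict) -> bool:
    """Hand-set voting rule standing in for a trained classifier (assumption)."""
    score = 0
    score += features["user_reputation"] < 10
    score += features["community_prior_deleted"] > 0
    score += features["content_has_code"] == 0.0
    score += features["content_length_words"] < 15
    score += features["syntax_lowercase_title"] == 1.0
    score += features["syntax_punct_runs"] > 0
    return score >= 3


def accuracy(questions: list) -> float:
    """Fraction of questions whose predicted label equals the true label."""
    hits = sum(predict_deleted(extract_features(q)) == q.deleted for q in questions)
    return hits / len(questions)


# Two made-up questions, used only to exercise the functions above.
q_good = Question(
    title="How do I reverse a list in Python?",
    body=("I have a list and want to reverse it in place without copying. "
          "I tried <code>reversed(xs)</code> but it returns an iterator. "
          "What is the idiomatic way?"),
    author_reputation=250,
    author_prior_deleted=0,
    deleted=False,
)
q_bad = Question(
    title="plz help urgent!!!",
    body="my code not working??",
    author_reputation=1,
    author_prior_deleted=2,
    deleted=True,
)

if __name__ == "__main__":
    print(accuracy([q_good, q_bad]))
```

In a real experiment the voting rule would be replaced by a classifier fitted on labeled historical questions and evaluated on held-out data; the sketch only shows the shape of the pipeline.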