In this demonstration, we introduce a novel web-based intelligent interface that automatically detects and highlights programming content (programming code and messages) in Q&A programming forums. We expect the interface to improve the visual presentation of such forum content and to encourage effective participation. We approach the problem with several alternative methods: a dictionary-based baseline, a non-sequential Naïve Bayes classifier, and Conditional Random Fields (CRF), a sequential labeling framework. The CRF method produces the best results, with an F1-score of 86.9%. We also experimentally validate the robustness of the classifier by testing a CRF model trained on a C++ forum against a Python dataset and a Java dataset. The results indicate that the classifier works quite well across different domains. To demonstrate the detection results, we developed a web-based graphical user interface that accepts a programming forum message as user input, processes it with the trained CRF model, and displays the detected programming content snippets in a different font.
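As an illustration only, the dictionary-based baseline and the F1 metric mentioned above might be sketched as follows. The term list, the symbol set, the tokenizer, and the `CODE`/`TEXT` label names are all assumptions made for this sketch; the abstract does not specify the dictionary or the label scheme actually used.

```python
import re

# Hypothetical dictionary of programming terms (not the one used in the work).
CODE_TERMS = {"int", "void", "return", "for", "while", "if", "else",
              "class", "include", "std", "cout"}
# Hypothetical set of single-character symbols treated as code.
CODE_SYMBOLS = set("{}();=<>[]#")


def tokenize(message):
    """Split a message into identifiers, numbers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", message)


def label_tokens(tokens):
    """Label each token CODE if it is in the dictionary, otherwise TEXT."""
    return ["CODE" if tok in CODE_TERMS or tok in CODE_SYMBOLS else "TEXT"
            for tok in tokens]


def f1_score(gold, predicted, positive="CODE"):
    """Compute the F1 score for the positive label over aligned sequences."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


tokens = tokenize("Use int x = 0;")
labels = label_tokens(tokens)
```

A lookup of this kind labels each token in isolation, so it cannot recognize an identifier such as `x` as code from its neighbors. A sequential model such as CRF can use that surrounding context.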