Online forums give users a channel to report and discuss product problems and to troubleshoot them for faster resolution. Problems left unattended by the company can attract negative publicity, yet manually monitoring such a massive volume of discussion is laborious. This paper makes the first attempt at collecting issues that require immediate action by the product supplier by analyzing the large body of information on forums. Features specific to forum discussions, in conjunction with linguistic cues, help in capturing and better prioritizing issues. Collecting training data to learn a classifier for this task would require enormous labeling effort. Hence, this paper adopts a co-training approach, which needs minimal manual labeling, coupled with linguistic features extracted using a set-expansion algorithm, to discover severe problems. Further, the most distinct and recent issues are obtained by incorporating measures of the 'centrality' and 'diversity' of forum threads together with their temporal aspect. We show that this helps in better prioritizing longstanding issues and in identifying issues that need to be addressed immediately.
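The abstract does not state how centrality, diversity and the temporal aspect are combined, so the following is only a minimal sketch of one plausible reading, not the paper's method. It assumes centrality is a thread's mean cosine similarity to the other threads, recency is an exponential decay with a hypothetical `half_life_days` parameter, and diversity is enforced by a greedy penalty on similarity to threads already selected. The function names, the thread keys (`id`, `vec`, `day`) and the weights are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_threads(threads, now, k=3, half_life_days=30.0, diversity_weight=0.5):
    """Greedily pick up to k thread ids (all parameters are assumptions).

    Each thread is a dict with hypothetical keys:
      'id'  : identifier, 'vec' : term-weight dict, 'day' : posting day.
    """
    # Centrality: mean similarity of a thread to every other thread.
    central = {}
    for t in threads:
        sims = [cosine(t["vec"], u["vec"]) for u in threads if u is not t]
        central[t["id"]] = sum(sims) / len(sims) if sims else 0.0

    # Recency: weight halves every half_life_days.
    recency = {
        t["id"]: 0.5 ** ((now - t["day"]) / half_life_days) for t in threads
    }

    selected = []
    remaining = list(threads)
    while remaining and len(selected) < k:
        def score(t):
            # Diversity: penalise similarity to anything already chosen.
            redundancy = max(
                (cosine(t["vec"], s["vec"]) for s in selected), default=0.0
            )
            return central[t["id"]] * recency[t["id"]] - diversity_weight * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [t["id"] for t in selected]


# Toy data: 'b' is an older near-duplicate of 'a'; 'c' is a distinct issue.
example_threads = [
    {"id": "a", "vec": {"crash": 1.0, "login": 1.0}, "day": 100},
    {"id": "b", "vec": {"crash": 1.0, "login": 1.0}, "day": 70},
    {"id": "c", "vec": {"battery": 1.0, "drain": 1.0, "crash": 1.0}, "day": 100},
]
print(rank_threads(example_threads, now=100, k=2))  # prints ['a', 'c']
```

In this toy run the most central recent thread is chosen first, and its older duplicate is then passed over in favour of the distinct issue, which is the behaviour the abstract attributes to combining the three signals.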