PriSM: discovering and prioritizing severe technical issues from product discussion forums

Authors:
Rashmi Gangadharaiah;Rose Catherine
Affiliations:
IBM Research India, Bangalore, India;IBM Research India, Bangalore, India
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Online forums give users a channel to report and discuss product problems and troubleshooting steps so that issues are resolved faster. If companies leave these discussions unattended, they can attract negative publicity. Manually monitoring such a large volume of discussion is laborious. This paper makes the first attempt at identifying issues that require immediate action by the product supplier by analyzing the information available on forums. Features specific to forum discussions, in conjunction with linguistic cues, help in capturing and better prioritizing issues. Collecting training data to learn a classifier for this task would require enormous labeling effort. Hence, this paper adopts a co-training approach, which needs minimal manual labeling, coupled with linguistic features extracted using a set-expansion algorithm, to discover severe problems. Further, the most distinct and recent issues are obtained by incorporating measures of 'centrality' and 'diversity' and the temporal aspect of the forum threads. We show that this helps in better prioritizing longstanding issues and in identifying issues that need to be addressed immediately.
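
The abstract does not give the ranking formula, so the following is only an illustrative sketch of how 'centrality', 'diversity' and a temporal signal could be combined when ordering threads. Everything in it is an assumption rather than the authors' method: Jaccard overlap of token sets stands in for thread similarity, recency is modeled as an exponential decay with a hypothetical `half_life_days`, and diversity is enforced by a greedy selection that penalizes overlap with threads already chosen. The names `rank_threads`, `tokens`, `last_post_day` and `diversity_weight` are invented for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of tokens."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_threads(threads, now_day, k=3, half_life_days=30.0, diversity_weight=0.5):
    """Greedily pick up to k thread ids, trading relevance against redundancy.

    Assumed (hypothetical) input: each thread is a dict with keys
    'id', 'tokens' (iterable of words) and 'last_post_day' (float, in days).
    """
    n = len(threads)
    sim = [[jaccard(threads[i]["tokens"], threads[j]["tokens"]) for j in range(n)]
           for i in range(n)]

    relevance = []
    for i, thread in enumerate(threads):
        # Centrality: average similarity to every other thread.
        centrality = sum(sim[i][j] for j in range(n) if j != i) / max(n - 1, 1)
        # Temporal aspect: weight halves every `half_life_days` of inactivity.
        recency = 0.5 ** ((now_day - thread["last_post_day"]) / half_life_days)
        relevance.append(centrality * recency)

    selected = []
    remaining = list(range(n))
    while remaining and len(selected) < k:
        def score(i):
            # Diversity: penalize overlap with the closest already-selected thread.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return (1 - diversity_weight) * relevance[i] - diversity_weight * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return [threads[i]["id"] for i in selected]


# Toy data, invented for illustration only.
threads = [
    {"id": "t1", "tokens": ["battery", "drain", "fast"], "last_post_day": 100},
    {"id": "t2", "tokens": ["battery", "drain", "quick"], "last_post_day": 99},
    {"id": "t3", "tokens": ["screen", "crack"], "last_post_day": 100},
    {"id": "t4", "tokens": ["battery", "drain", "fast", "overnight"], "last_post_day": 10},
]
top = rank_threads(threads, now_day=100, k=2)
```

With the toy data, the most central recent thread is chosen first, and the diversity penalty then steers the second pick away from its near-duplicates; setting `diversity_weight=0.0` reduces the procedure to ordering by centrality times recency.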