We perform a linguistic analysis of bug-report titles obtained from the publicly available Bugzilla defect tracking tool for the open-source Firefox browser (Mozilla project) and present the results. Our motivation is to gain insight into how people describe software defects and to study the feasibility of building a predictive model (a classifier) that assigns a bug report to one of the predefined severity levels (bug importance) based only on its title. We observed that, in general, bug titles do not contain enough information to predict their importance automatically with high accuracy. However, two of the bug importance categories, critical and enhancement, have characteristics or features in the title that can be exploited to assign the correct severity level. We perform statistical analysis of part-of-speech tags, word frequency, and their distribution across the severity levels.
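To make the word-frequency part of the analysis concrete, the following is a minimal sketch of counting words per severity level. It is not the authors' implementation: the titles, the severity labels attached to them, and the helper names (`tokenize`, `word_counts_by_severity`, `relative_frequency`) are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (title, severity) pairs invented for illustration,
# not taken from the Firefox Bugzilla data set.
TITLES = [
    ("Crash when opening a new tab", "critical"),
    ("Browser crash on startup with corrupted profile", "critical"),
    ("Add option to reorder bookmarks toolbar", "enhancement"),
    ("Allow user to add custom search shortcuts", "enhancement"),
    ("Tooltip text is truncated in preferences", "normal"),
]


def tokenize(title):
    """Lowercase a title and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", title.lower())


def word_counts_by_severity(pairs):
    """Build one word-frequency Counter per severity level."""
    counts = defaultdict(Counter)
    for title, severity in pairs:
        counts[severity].update(tokenize(title))
    return counts


def relative_frequency(counts, word):
    """Share of each severity level's tokens that are `word`."""
    result = {}
    for severity, counter in counts.items():
        total = sum(counter.values())
        if total:
            result[severity] = counter[word] / total
    return result


counts = word_counts_by_severity(TITLES)
crash_freq = relative_frequency(counts, "crash")
add_freq = relative_frequency(counts, "add")
```

Comparing `crash_freq` and `add_freq` across levels shows the kind of signal the abstract describes: a word that is concentrated in one severity level is a candidate feature for a title-based classifier, while words spread evenly across levels carry little information.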