Toward mining "concept keywords" from identifiers in large software projects
MSR '05 Proceedings of the 2005 international workshop on Mining software repositories
Program comprehension of legacy systems is a highly knowledge-intensive task. One of the goals of reverse engineering is to provide automated help in relating application domain concepts to all their implementation instances. It is generally accepted that doing so would require analyzing documentation such as identifiers or comments. However, before attempting this difficult analysis, it would be useful to know precisely what information the documentation contains and whether the analysis is worth attempting. We present here the results of a study of the knowledge contained in two sources of documentation for the Mosaic system. This knowledge is categorized into various domains, and the relative proportions of these domains are discussed. Among other things, the results highlight the high frequency with which application domain concepts are used, which could provide a means of identifying them.