To access the knowledge contained in developer communication, such as forum posts, it is useful to determine automatically which code elements the discussions refer to. We propose a novel traceability recovery approach that extracts the code elements contained in various documents. Unlike previous work, our approach does not require an index of code elements to find links, which makes it particularly well suited to the analysis of informal documentation. When evaluated on 188 Stack Overflow answer posts containing 993 code elements, the technique achieves an average precision of 0.92 and an average recall of 0.90. As a major refinement of traditional traceability approaches, we also propose to detect which of the code elements in a document are salient, or germane, to the topic of the post. To this end, we developed a three-feature decision tree classifier that achieves a precision of 0.65-0.74 and a recall of 0.30-0.65, depending on the subject of the document.
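As an illustration of how index-free extraction and its evaluation might look, the following minimal Python sketch pulls candidate code elements out of free text with two regular expressions and scores them against a hand-labelled set. It is not the technique proposed in the paper: the patterns, the function names `extract_code_elements` and `precision_recall`, and the sample post and gold set are all hypothetical and chosen only to make the precision and recall figures concrete.

```python
import re

# Hypothetical patterns (assumptions, not the paper's parser):
# CamelCase names with at least two capitalised parts, e.g. "HttpClient".
TYPE_PATTERN = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")
# Qualified calls such as "client.execute(" or "EntityUtils.toString(".
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\.([a-z]\w*)\(")


def extract_code_elements(text):
    """Return candidate code elements found in free text, with no index lookup."""
    elements = set(TYPE_PATTERN.findall(text))
    for qualifier, method in CALL_PATTERN.findall(text):
        elements.add(f"{qualifier}.{method}")
    return elements


def precision_recall(extracted, gold):
    """Compute precision and recall of extracted elements against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example post and made-up gold annotations.
post = (
    "Use HttpClient to send the request: client.execute(request) returns "
    "a response you can pass to EntityUtils.toString(response)."
)
gold = {"HttpClient", "HttpClient.execute", "EntityUtils.toString"}

found = extract_code_elements(post)
precision, recall = precision_recall(found, gold)
print(sorted(found))
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example the extractor reports `client.execute` rather than `HttpClient.execute`, because it cannot tell that the variable `client` has type `HttpClient`. Resolving such references without an index of code elements is the kind of problem the approach described above addresses.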