Discovering essential code elements in informal documentation

Authors:
Peter C. Rigby;Martin P. Robillard
Affiliations:
Concordia University, Canada;McGill University, Canada
Venue:
Proceedings of the 2013 International Conference on Software Engineering
Year:
2013

Abstract

To access the knowledge contained in developer communication, such as forum posts, it is useful to automatically determine which code elements the discussions refer to. We propose a novel traceability recovery approach to extract the code elements contained in various documents. Unlike previous work, our approach does not require an index of code elements to find links, which makes it particularly well suited to the analysis of informal documentation. When evaluated on 188 StackOverflow answer posts containing 993 code elements, the technique achieves an average precision of 0.92 and an average recall of 0.90. As a major refinement on traditional traceability approaches, we also propose to detect which of the code elements in a document are salient, or germane, to the topic of the post. To this end, we developed a three-feature decision tree classifier that achieves a precision of 0.65-0.74 and a recall of 0.30-0.65, depending on the subject of the document.
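To make the evaluation measures concrete, the following is a minimal sketch of how precision and recall could be computed for code elements pulled out of free text without consulting an index. It is not the authors' technique: the two regular expressions and the function names `extract_code_elements` and `precision_recall` are assumptions made for illustration only, and the sample post and its hand-labeled elements are invented.

```python
import re

# Hypothetical patterns (assumptions, not taken from the paper):
# CALL matches a method call with empty parentheses, optionally qualified,
# e.g. "append()" or "sb.append()".
CALL = re.compile(r"\b(?:[A-Za-z_]\w*\.)*[a-z_]\w*\(\)")
# TYPE matches a CamelCase name with at least two capitalized parts,
# e.g. "StringBuilder" or "HashMap".
TYPE = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b")


def extract_code_elements(text):
    """Return the set of candidate code elements found in free text."""
    found = set(CALL.findall(text))
    found.update(TYPE.findall(text))
    return found


def precision_recall(predicted, expected):
    """Compute precision and recall of a predicted set against a labeled set."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Invented example post and hand-labeled elements, for illustration only.
post = "Use StringBuilder and call append() in the loop; a HashMap is not needed."
labeled = {"StringBuilder", "append()", "HashMap"}

predicted = extract_code_elements(post)
precision, recall = precision_recall(predicted, labeled)
print(sorted(predicted), precision, recall)
```

In this toy case every labeled element is recovered and nothing spurious is matched, so both measures are 1.0. Patterns this simple would misfire on ordinary prose that happens to look like code, which is the kind of ambiguity the reported 0.92 precision and 0.90 recall quantify for the actual approach.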