Detecting API documentation errors

  • Authors:
  • Hao Zhong; Zhendong Su

  • Affiliations:
  • Institute of Software, Chinese Academy of Sciences, Beijing, China and University of California, Davis, Davis, CA, USA; University of California, Davis, Davis, CA, USA

  • Venue:
  • Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA)
  • Year:
  • 2013


Abstract

When programmers encounter an unfamiliar API library, they often need to refer to its documentation, tutorials, or discussions on development forums to learn its proper usage. These API documents contain valuable information, but may also mislead programmers because they can contain errors (e.g., broken code names and obsolete code samples). Although most API documents are actively maintained and updated, studies show that many new and latent errors do exist. Finding such errors manually is tedious and error-prone, as API documents can run to thousands of pages. Existing tools are ineffective at locating documentation errors because traditional natural language (NL) tools do not understand code names and code samples, and traditional code analysis tools do not understand NL sentences. In this paper, we propose the first approach, DOCREF, specifically designed and developed to detect API documentation errors. We formulate a class of inconsistencies that indicate potential documentation errors, and combine NL and code analysis techniques to detect and report such inconsistencies. We have implemented DOCREF and evaluated its effectiveness on the latest documentation of five widely used API libraries. DOCREF has detected more than 1,000 new documentation errors, which we have reported to the authors. Many of the reported errors have already been confirmed and fixed.
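The core idea of cross-checking code names mentioned in documentation against a library's actual API can be illustrated with a minimal sketch. The snippet below is not DOCREF itself; the regular expression, the example documentation text, and the declared-API set are assumptions chosen only to show the inconsistency-detection idea described in the abstract.

```python
import re

# Hypothetical inputs: the names actually declared by the library (which a
# real tool would extract via code analysis), and a fragment of its docs.
DECLARED_API_NAMES = {"FileUtils.readLines", "FileUtils.writeLines", "IOUtils.copy"}

DOC_TEXT = """
Use FileUtils.readLine to load the file, then pass the result to IOUtils.copy.
"""

# A crude pattern for code-like tokens (Class.method); a real tool would need
# NL analysis to reliably separate code names from ordinary prose.
CODE_NAME_PATTERN = re.compile(r"\b[A-Z][A-Za-z0-9]*\.[a-z][A-Za-z0-9]*\b")

def find_broken_code_names(doc_text, declared_names):
    """Report code-like names in the documentation that do not resolve
    against the declared API (one class of potential documentation error)."""
    mentioned = set(CODE_NAME_PATTERN.findall(doc_text))
    return sorted(mentioned - declared_names)

if __name__ == "__main__":
    for name in find_broken_code_names(DOC_TEXT, DECLARED_API_NAMES):
        print(f"Possible documentation error: unresolved code name '{name}'")
```

Running this sketch flags `FileUtils.readLine` because the library declares `FileUtils.readLines` instead, the kind of broken code name the abstract cites as a typical documentation error.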