Inferring specifications for resources from natural language API documentation

  • Authors: Hao Zhong; Lu Zhang; Tao Xie; Hong Mei

  • Affiliations: Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing, China; School of Electronics Engineering and Computer Science, Peking University, Beijing, China and The Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education ...; Department of Computer Science, North Carolina State University, Raleigh, USA; School of Electronics Engineering and Computer Science, Peking University, Beijing, China and The Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education ...

  • Venue: Automated Software Engineering
  • Year: 2011

Abstract

Many software libraries, especially commercial ones, provide natural language API documentation that describes correct API usage. However, developers may still write code that is inconsistent with this documentation, partly because, as existing research shows, many developers are reluctant to read API documentation carefully. Because these inconsistencies may indicate defects, researchers have proposed various detection approaches, but these approaches require many known specifications. Writing specifications manually for all APIs is tedious, so various approaches have been proposed to mine specifications automatically. Most existing mining approaches rely on analyzing client code and therefore fail to mine specifications when sufficient client code is not available. Instead of analyzing client code, we propose an approach, called Doc2Spec, that infers resource specifications from natural language API documentation. We evaluated our approach on the Javadocs of five libraries. The results show that our approach performs well on real-scale libraries and infers various specifications with relatively high precision, recall, and F-scores. We further used the inferred specifications to detect defects in open source projects. The results show that specifications inferred by Doc2Spec are useful for detecting real defects in existing projects.