Inferring method specifications from natural language API descriptions

Authors:
Rahul Pandita;Xusheng Xiao;Hao Zhong;Tao Xie;Stephen Oney;Amit Paradkar
Affiliations:
North Carolina State University, USA;North Carolina State University, USA;Chinese Academy of Sciences, China;North Carolina State University, USA;CMU, USA;IBM Research, USA
Venue:
Proceedings of the 34th International Conference on Software Engineering
Year:
2012

Abstract

Application Programming Interface (API) documents are a typical way of describing legal usage of reusable software libraries, thus facilitating software reuse. However, even with such documents, developers often overlook some documents and build software systems that are inconsistent with the legal usage of those libraries. Existing software verification tools require formal specifications (such as code contracts), and therefore cannot directly verify the legal usage described in the natural language text of API documents against the code using those libraries. In practice, most libraries do not come with formal specifications, which hinders tool-based verification. To address this issue, we propose a novel approach to infer formal specifications from the natural language text of API documents. Our evaluation results show that our approach achieves an average of 92% precision and 93% recall in identifying sentences that describe code contracts from more than 2500 sentences of API documents. Furthermore, our results show that our approach has an average 83% accuracy in inferring specifications from over 1600 sentences describing code contracts.
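
The precision and recall figures in the abstract can be read with the standard definitions for a binary labeling task (here: does a sentence describe a code contract or not). The following is a minimal sketch under that assumption; the `Counts` type, the function names, and the numbers in `example` are hypothetical illustrations and are not taken from the paper's data or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Outcome counts for a binary sentence classifier (hypothetical helper)."""
    true_positives: int   # contract sentences correctly flagged
    false_positives: int  # non-contract sentences wrongly flagged
    false_negatives: int  # contract sentences that were missed


def precision(c: Counts) -> float:
    """Fraction of flagged sentences that really describe a contract."""
    flagged = c.true_positives + c.false_positives
    return c.true_positives / flagged if flagged else 0.0


def recall(c: Counts) -> float:
    """Fraction of real contract sentences that were flagged."""
    actual = c.true_positives + c.false_negatives
    return c.true_positives / actual if actual else 0.0


# Made-up counts for illustration only; NOT the paper's results.
example = Counts(true_positives=9, false_positives=1, false_negatives=3)

if __name__ == "__main__":
    print(f"precision = {precision(example):.2f}")
    print(f"recall    = {recall(example):.2f}")
```

With these made-up counts, 9 of the 10 flagged sentences are correct and 9 of the 12 real contract sentences are found. An average over several API libraries, as reported in the abstract, would apply the same formulas per library before averaging.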