Automated information extraction from web APIs documentation

Authors:
Papa Alioune Ly; Carlos Pedrinaci; John Domingue
Affiliations:
Knowledge Media Institute, The Open University, UK, and School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland; Knowledge Media Institute, The Open University, UK; Knowledge Media Institute, The Open University, UK
Venue:
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Year:
2012

Abstract

A fundamental characteristic of Web APIs is that, in practice, providers follow hardly any standard practices when implementing, publishing, and documenting their APIs. As a consequence, the discovery and use of these services by third parties is significantly hampered. To achieve further automation in exploiting Web APIs, we present an approach for automatically extracting relevant technical information from the Web pages that document them. In particular, we have devised two algorithms that automatically extract technical details such as operation names, operation descriptions, or URI templates from the documentation of Web APIs adopting either RPC or RESTful interfaces. The algorithms, which exploit advanced DOM processing as well as state-of-the-art Information Extraction and Natural Language Processing techniques, have been evaluated against a detailed dataset and exhibit high precision and recall (around 90% for both REST and RPC APIs), outperforming state-of-the-art information extraction algorithms.
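
To make the extraction target concrete, the following is a minimal sketch of pulling candidate URI templates out of an HTML documentation page. It is not the algorithm described in the paper: it assumes templates mark parameters with curly braces (as in `/users/{id}`), and the names `TextCollector`, `extract_uri_templates`, and the regular expression `URI_TEMPLATE` are hypothetical, introduced only for illustration. It uses only the Python standard library.

```python
import re
from html.parser import HTMLParser

# Assumption: a URI template is an absolute http(s) URL or a path starting
# with "/" that contains at least one {parameter} placeholder.
URI_TEMPLATE = re.compile(r"(?:https?://|/)[^\s<>\"']*\{[^{}\s]+\}[^\s<>\"']*")


class TextCollector(HTMLParser):
    """Collects visible text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_uri_templates(html):
    """Return candidate URI templates in document order, without duplicates."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    found = []
    for chunk in parser.chunks:
        for match in URI_TEMPLATE.findall(chunk):
            if match not in found:
                found.append(match)
    return found


if __name__ == "__main__":
    page = (
        "<html><body><h3>Get user</h3>"
        "<code>GET http://api.example.com/users/{id}</code>"
        "<p>Returns a user.</p>"
        "<pre>DELETE /users/{id}/photos/{photo_id}</pre>"
        "</body></html>"
    )
    for template in extract_uri_templates(page):
        print(template)
```

A pattern-only approach like this ignores page layout entirely; the abstract indicates that the authors' algorithms additionally rely on DOM processing and Natural Language Processing to associate operation names and descriptions with such templates.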