Towards a SOA infrastructure for statistically analysing public health data

Authors:
Pierpaolo Vittorini;Stefano Necozione;Ferdinando di Orio
Affiliations:
University of L'Aquila, Coppito, ITALY;University of L'Aquila, Coppito, ITALY;University of L'Aquila, Coppito, ITALY
Venue:
Proceedings of the ACM first workshop on CyberInfrastructure: information management in eScience
Year:
2007

Citing 8
Cited 0

Reuse-based software engineering: techniques, organization, and controls
eXist: An Open Source Native XML Database. Revised Papers from the NODe 2002 Web and Database-Related Workshops on Web, Web-Services, and Database Systems
What are Web services? Communications of the ACM - E-services: a cornucopia of digital offerings ushers in the next Net-based evolution
Web Services Platform Architecture: SOAP, WSDL, WS-Policy, WS-Addressing, WS-BPEL, WS-Reliable Messaging and More
Bridging the gap between OLAP and SQL. VLDB '05 Proceedings of the 31st international conference on Very large data bases
MonetDB/XQuery: a fast XQuery processor powered by a relational engine. Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Biomedical Informatics: Computer Applications in Health Care and Biomedicine (Health Informatics)
Data management in medicine: the EPIweb information system, a case study and some open issues. EDBT'06 Proceedings of the 2006 international conference on Current Trends in Database Technology

Abstract

To respond to the need for interoperable information systems in public health, several proposals based on XML-related technologies are currently available. For instance, the CDA [8] is an architecture developed by the HL7 organization for representing and managing clinical documents, while the PHIN [12] is a CDC infrastructure that aims to automate the exchange of XML data between public health partners through ebXML-compliant SOAP web services [16, 11]. Despite the considerable effort spent on developing standards and infrastructures that, although not yet conclusive, help achieve more effective interoperability among public health information systems, to the best of our knowledge no research has so far addressed the statistical analysis of biomedical data represented as XML documents. Among the languages that can query XML documents, XPath can compute only basic statistics (e.g. mean, minimum, maximum [13]), whereas it is well known that higher-level statistical tests are required for every common analysis. Thus, the only option currently available is to convert the data stored in such documents into a tabular format and to use a standard statistical package to perform the analysis. To overcome this limitation, the paper proposes a complete SOA infrastructure, focusing on the following components: (i) the XFNSE web service, which exposes a list of operations implementing the statistical analyses reported in [6]; (ii) a prototype service consumer, called JXFNSE, which uses the operations exposed by XFNSE to statistically analyse a dataset; (iii) an eXist [14] module which extends XPath by adding a list of functions "tracing" the XFNSE operations. Finally, based on several performance measurements, the authors discuss the drawbacks of using services to analyse large datasets and show a possible improvement based on a caching mechanism.
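
The conversion workaround mentioned in the abstract (flatten the XML into a tabular form, then run the test elsewhere) can be illustrated with a minimal Python sketch. It is not the authors' XFNSE or JXFNSE code: the XML layout, the element and attribute names (`patient`, `group`, `sbp`), the sample values, and the helper names `to_rows` and `welch_t` are all assumptions made for illustration, and Welch's t statistic is used only as one example of a test beyond basic aggregates.

```python
import math
import statistics
import xml.etree.ElementTree as ET

# Hypothetical XML document; the schema and the values are invented for this sketch.
XML_DATA = """
<patients>
  <patient group="treated"><sbp>118</sbp></patient>
  <patient group="treated"><sbp>122</sbp></patient>
  <patient group="treated"><sbp>120</sbp></patient>
  <patient group="control"><sbp>130</sbp></patient>
  <patient group="control"><sbp>134</sbp></patient>
  <patient group="control"><sbp>132</sbp></patient>
</patients>
"""


def to_rows(xml_text):
    """Flatten each <patient> element into a (group, value) row."""
    root = ET.fromstring(xml_text)
    return [
        (patient.get("group"), float(patient.findtext("sbp")))
        for patient in root.iter("patient")
    ]


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples (no p-value)."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err


# Step 1: XML -> tabular rows. Step 2: split by group. Step 3: run the test.
rows = to_rows(XML_DATA)
treated = [value for group, value in rows if group == "treated"]
control = [value for group, value in rows if group == "control"]
t_stat = welch_t(treated, control)
print(f"Welch t statistic: {t_stat:.3f}")
```

The sketch makes the extra step visible: the data must leave its XML representation before any test beyond a simple aggregate can be applied, which is the limitation the proposed infrastructure is meant to remove.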