Checking determinism of XML Schema content models in optimal time

  • Authors:
  • Pekka Kilpeläinen

  • Affiliations:
  • University of Eastern Finland, School of Computing, Kuopio, Finland

  • Venue:
  • Information Systems
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the determinism checking of XML Schema content models, as required by the W3C Recommendation. We argue that currently applied solutions have flaws and make processors vulnerable to exponential resource needs by pathological schemas, and we help to eliminate this potential vulnerability of XML Schema based systems. XML Schema content models are essentially regular expressions extended with numeric occurrence indicators. A previously published polynomial-time solution to check the determinism of such expressions is improved to run in linear time, and the improved algorithm is implemented and evaluated experimentally. When compared to the corresponding method of a popular production-quality XML Schema processor, the new implementation runs orders of magnitude faster. Enhancing the solution to take further extensions of XML Schema into account without compromising its linear scalability is also discussed.