Determining the Complexity of XML Documents

  • Authors:
  • Mustafa H. Qureshi; M. H. Samadzadeh

  • Affiliations:
  • PeopleSoft, Inc., Denver, CO; Oklahoma State University, Stillwater, OK

  • Venue:
  • ITCC '05 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'05) - Volume II
  • Year:
  • 2005

Abstract

The eXtensible Markup Language (XML) is a recommendation of the World Wide Web Consortium (W3C). It is a public format and has been widely adopted as a means of interchanging information among computer programs. Because XML documents are typically large, ways are needed to improve their ease of use and maintainability by keeping their complexity low. This research focused on different ways of determining the complexity of XML documents based on various syntactic and structural aspects of these documents. An XML document represents a generic tree: the document is a pre-order traversal of its equivalent XML tree. One important finding was that documents with higher nesting levels carried greater weight and could therefore be viewed as more complicated than documents with lower nesting levels. Another important finding related to Document Type Definitions (DTDs): DTDs can be expressed as regular expressions, which provides a means of calculating quantitative values.
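
To make the nesting-level observation concrete, the following is a minimal sketch and not the authors' metric, which the abstract does not specify. It assumes a hypothetical weighting in which each element contributes its nesting depth (root = 1), and it walks the tree in pre-order to mirror the document order. The names preorder_depths and nesting_weight are illustrative only.

```python
import xml.etree.ElementTree as ET


def preorder_depths(xml_text):
    """Return (tag, depth) pairs in pre-order; the root element has depth 1."""
    root = ET.fromstring(xml_text)
    visited = []

    def visit(node, depth):
        # Pre-order: record the node before descending into its children.
        visited.append((node.tag, depth))
        for child in node:
            visit(child, depth + 1)

    visit(root, 1)
    return visited


def nesting_weight(xml_text):
    """Hypothetical weight (an assumption): the sum of every element's depth."""
    return sum(depth for _, depth in preorder_depths(xml_text))


# Two documents with the same number of elements but different nesting.
FLAT = "<a><b/><c/><d/></a>"
NESTED = "<a><b><c><d/></c></b></a>"

if __name__ == "__main__":
    print(preorder_depths(NESTED))
    print(nesting_weight(FLAT), nesting_weight(NESTED))
```

Under this assumed weighting, the two sample documents contain four elements each, yet the deeply nested one scores higher than the flat one, which is in line with the finding that higher nesting levels carry more weight.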