Determining the Complexity of XML Documents

Authors:
Mustafa H. Qureshi;M. H. Samadzadeh
Affiliations:
PeopleSoft, Inc., Denver, CO;Oklahoma State University, Stillwater, OK
Venue:
ITCC '05: Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'05) - Volume II
Year:
2005

Citing 0
Cited 4

EXEM: Efficient XML data exchange management for mobile applications
Information Systems Frontiers

Entropy metric for XML DTD documents
ACM SIGSOFT Software Engineering Notes

Analysing complexity of XML schemas in geospatial web services
Proceedings of the 2nd International Conference on Computing for Geospatial Research & Applications

A complete path representation method with a modified inverted index for efficient retrieval of XML documents
WSEAS Transactions on Computers

Abstract

The eXtensible Markup Language (XML) is a recommendation of the World Wide Web Consortium (W3C). It is a public format that has been widely adopted for interchanging information among computer programs. Because XML documents are typically large, their ease of use and maintainability depend on keeping their complexity low. This research examined different ways of determining the complexity of XML documents based on various syntactic and structural aspects of these documents. An XML document represents a generic tree, and the document itself is a pre-order traversal of the equivalent XML tree. One important finding was that documents with higher nesting levels carried more weight and could therefore be viewed as more complicated than documents with lower nesting levels. Another important finding concerned Document Type Definitions (DTDs): DTDs can be expressed as regular expressions, which provides a means of calculating quantitative values.
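
To illustrate the tree view described in the abstract, the following is a minimal sketch in Python. It is not the authors' metric, which the abstract does not specify. It assumes a hypothetical weighting in which each element contributes its nesting depth (root = 1) to the total. The function names `preorder_with_depth` and `depth_weighted_score` are invented for this example.

```python
import xml.etree.ElementTree as ET


def preorder_with_depth(element, depth=1):
    """Yield (tag, depth) pairs in pre-order, which matches document order."""
    yield element.tag, depth
    for child in element:
        yield from preorder_with_depth(child, depth + 1)


def depth_weighted_score(xml_text):
    """Sum the depth of every element.

    Assumption for illustration only: an element's weight equals its
    nesting depth, so deeper documents score higher.
    """
    root = ET.fromstring(xml_text)
    return sum(depth for _tag, depth in preorder_with_depth(root))


# Two documents with the same four elements but different nesting.
flat = "<a><b/><c/><d/></a>"
nested = "<a><b><c><d/></c></b></a>"

print(list(preorder_with_depth(ET.fromstring(flat))))
print(depth_weighted_score(flat), depth_weighted_score(nested))
```

Under this assumed weighting, the flat document scores 1 + 2 + 2 + 2 = 7 and the nested one scores 1 + 2 + 3 + 4 = 10. This agrees with the observation that higher nesting levels carry more weight.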