XML normalization based on entity segments

  • Authors: Xudong Lin; Ning Wang; Xiaoning Zeng; Yanyan Sun
  • Venue: Information Sciences: an International Journal
  • Year: 2013

Abstract

Normalizing XML data is more difficult than normalizing relational data. In the relational data model, semantically relevant attributes are grouped into relations, which simplifies normalization. In XML, however, structural characteristics prevent the semantic relevance among data items from being stated explicitly. As a result, in existing XML normalization proposals, XML constraints hold over unsuitable ranges and do not faithfully match the original information relevance. In this paper, a kind of semantically relevant information set, the entity segment, is used to limit the ranges over which XML constraints hold. Based on entity segments, XML constraints are defined as XML attribute dependencies, which faithfully reflect the original information relevance. In addition, entity primary keys are defined as the unique identifiers of entity segments, and relationships among different entity segments are expressed through entity foreign keys. To form a normalization system for XML schema design, XML integrity rules and an XML normal form are proposed, and their effect on normalizing XML data is shown through practical instances. An information-theoretic measure is used to further justify their roles. It is concluded that entity segments are suitable ranges within which XML constraints can faithfully match the original information relevance, and that the proposal is effective both at avoiding XML data redundancy and at preserving XML data consistency.
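
The abstract does not give the paper's formal definitions, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each `student` element acts as an entity segment, that its `sno` attribute plays the role of an entity primary key, and that `sno` should determine `name` (an attribute dependency). The sample document, the element names, and the helper functions `dependency_violations` and `redundant_copies` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document: student s1 is stored under two courses,
# so the (sno, name) pair is repeated.
XML = """
<school>
  <course cno="c1">
    <student sno="s1"><name>Ann</name></student>
    <student sno="s2"><name>Bob</name></student>
  </course>
  <course cno="c2">
    <student sno="s1"><name>Ann</name></student>
  </course>
</school>
"""

def dependency_violations(root, segment_tag, key_attr, dependent_tag):
    """Return key values that map to more than one dependent value.

    Assumption: every <segment_tag> element is treated as one entity segment.
    """
    seen = defaultdict(set)
    for seg in root.iter(segment_tag):
        seen[seg.get(key_attr)].add(seg.findtext(dependent_tag))
    return {key: vals for key, vals in seen.items() if len(vals) > 1}

def redundant_copies(root, segment_tag, key_attr):
    """Count how many extra times each key value is stored."""
    counts = defaultdict(int)
    for seg in root.iter(segment_tag):
        counts[seg.get(key_attr)] += 1
    return {key: n - 1 for key, n in counts.items() if n > 1}

root = ET.fromstring(XML)
violations = dependency_violations(root, "student", "sno", "name")
copies = redundant_copies(root, "student", "sno")
print("violations:", violations)
print("redundant copies:", copies)
```

In this sample the dependency holds, but student `s1` is stored twice. Moving student data into a single segment and referring to it from each course by `sno` (in the spirit of an entity foreign key) would remove the duplicate and the risk of the two copies diverging.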