Emancipating instances from the tyranny of classes in information modeling

  • Authors:
  • Jeffrey Parsons;Yair Wand

  • Affiliations:
  • Memorial Univ. of Newfoundland, St. John's, Nfld., Canada;Univ. of British Columbia, Vancouver, B.C., Canada

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 2000

Abstract

Database design commonly assumes, explicitly or implicitly, that instances must belong to classes. This can be termed the assumption of inherent classification. We argue that the extent and complexity of problems in schema integration, schema evolution, and interoperability are, to a large degree, consequences of inherent classification. Furthermore, we make the case that the assumption of inherent classification violates philosophical and cognitive guidelines on classification and is, therefore, inappropriate in view of the role of data modeling in representing knowledge about application domains.
As an alternative, we propose a layered approach to modeling in which information about instances is separated from any particular classification. Two data modeling layers are proposed: (1) an instance model consisting of an instance base (i.e., information about instances and properties) and operations to populate, use, and maintain it; and (2) a class model consisting of a class base (i.e., information about classes defined in terms of properties) and operations to populate, use, and maintain it. The two-layered model provides class independence. This is analogous to the arguments of data independence offered by the relational model in comparison to hierarchical and network models.
We show that a two-layered approach yields several advantages. In particular, schema integration is shown to be partially an artifact of inherent classification that can be greatly simplified in designing a database based on a layered model; schema evolution is supported without the complexity of operations currently required by class-based models; and the difficulties associated with interoperability among heterogeneous databases are reduced because there is no need to agree on the semantics of classes among independent databases.
We conclude by considering the adequacy of a two-layered approach, outlining possible implementation strategies, and drawing attention to some practical considerations.
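
The two layers described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names InstanceBase, ClassBase, and their methods are hypothetical, and the membership rule used here (an instance belongs to a class when it possesses every property that defines the class) is an assumption chosen to match the phrase "classes defined in terms of properties".

```python
from dataclasses import dataclass, field


@dataclass
class InstanceBase:
    """Instance layer (hypothetical): instances and their properties, no class attached."""
    _props: dict = field(default_factory=dict)  # instance id -> {property name: value}

    def set_property(self, instance_id, name, value):
        # Creates the instance on first use, then records the property.
        self._props.setdefault(instance_id, {})[name] = value

    def properties_of(self, instance_id):
        return dict(self._props.get(instance_id, {}))

    def instance_ids(self):
        return list(self._props)


@dataclass
class ClassBase:
    """Class layer (hypothetical): a class is just a named set of property names."""
    _classes: dict = field(default_factory=dict)  # class name -> frozenset of property names

    def define_class(self, name, property_names):
        self._classes[name] = frozenset(property_names)

    def drop_class(self, name):
        self._classes.pop(name, None)

    def members(self, name, instances):
        # Assumed rule: membership is derived, not stored. An instance is a
        # member if it has every property that defines the class.
        required = self._classes[name]
        return sorted(
            i for i in instances.instance_ids()
            if required.issubset(instances.properties_of(i))
        )


# Populate the instance base without naming any class.
instances = InstanceBase()
instances.set_property("i1", "name", "Ada")
instances.set_property("i1", "salary", 5000)
instances.set_property("i2", "name", "Bob")
instances.set_property("i2", "student_number", "S42")

# Classes are layered on top and can overlap freely.
classes = ClassBase()
classes.define_class("Person", {"name"})
classes.define_class("Employee", {"name", "salary"})
classes.define_class("Student", {"name", "student_number"})

person_members = classes.members("Person", instances)
employee_members = classes.members("Employee", instances)

# Changing the class layer leaves the instance layer untouched.
classes.drop_class("Student")
remaining_props = instances.properties_of("i2")
```

In this sketch, dropping or redefining a class never modifies stored instance data, which is one way to picture the class independence the abstract refers to. A real system would also need the populate, use, and maintain operations the abstract mentions for each layer.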