Efficiently maintaining structural associations of semistructured data

Authors:
Dimitrios Katsaros
Affiliations:
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Venue:
PCI'01 Proceedings of the 8th Panhellenic conference on Informatics
Year:
2001

Citing 11
Cited 0

Extracting schema from semistructured data
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Storing semistructured data with STORED
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data

Borders: An Efficient Algorithm for Association Generation in Dynamic Databases
Journal of Intelligent Information Systems

Quantifying the utility of the past in mining large databases
Information Systems

Querying websites using compact skeletons
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Levelwise Search and Borders of Theories in Knowledge Discovery
Data Mining and Knowledge Discovery

Discovering Structural Association of Semistructured Data
IEEE Transactions on Knowledge and Data Engineering

Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering

Querying Semi-Structured Data
ICDT '97 Proceedings of the 6th International Conference on Database Theory

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases

Abstract

Semistructured data arise frequently on the Web and in data integration systems. Semistructured objects describing the same type of information have similar but not identical structure. Finding the common schema of a collection of such objects is an important task, and because of the huge volume of data involved, data mining techniques have been employed for it. Maintaining the discovered schema under updates is equally important. In this paper, we study the problem of maintaining the discovered schema when new objects are added to the database. We use the notion of "negative borders", introduced in the context of mining association rules, to find the new schema efficiently after such additions. We present experimental results that show the improved efficiency achieved by the proposed algorithm.
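
The negative-border idea mentioned in the abstract can be illustrated with a minimal sketch on flat itemsets. This is not the paper's algorithm: sets of labels stand in for its tree-structured patterns, and every name here (`mine`, `add_objects`, `is_minimal`, `min_sup`) is hypothetical. The sketch stores support counts for the frequent itemsets and for the negative border (the infrequent itemsets whose immediate subsets are all frequent). When objects are added, only the new objects are scanned; a look at the full database is needed only if a border itemset becomes frequent.

```python
import math
from itertools import combinations


def count_support(transactions, itemsets):
    """Count the transactions that contain each itemset."""
    return {s: sum(1 for t in transactions if s <= t) for s in itemsets}


def is_minimal(itemset, frequent):
    """True if every subset one item smaller is frequent (trivially so for singletons)."""
    k = len(itemset) - 1
    return k == 0 or all(frozenset(s) in frequent for s in combinations(itemset, k))


def mine(transactions, min_sup):
    """Levelwise search; returns counts for the frequent itemsets and the negative border."""
    min_count = math.ceil(min_sup * len(transactions))
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent, border = {}, {}
    k = 1
    while candidates:
        counts = count_support(transactions, candidates)
        level = [s for s, n in counts.items() if n >= min_count]
        frequent.update((s, counts[s]) for s in level)
        border.update((s, n) for s, n in counts.items() if n < min_count)
        # Next level: join frequent k-sets, keep those whose k-subsets are all frequent.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {s for s in joined if is_minimal(s, frequent)}
    return frequent, border


def add_objects(frequent, border, old_n, increment, min_sup):
    """Fold new objects into the stored counts by scanning only the increment.

    Returns (frequent, border, rescan_needed). If rescan_needed is True, an itemset
    outside the old frequent set became frequent, so the returned border may be
    incomplete and new candidates must be counted against the full database.
    """
    tracked = {**frequent, **border}
    for t in increment:
        for i in t:
            tracked.setdefault(frozenset([i]), 0)  # item never seen before
    delta = count_support(increment, tracked)
    min_count = math.ceil(min_sup * (old_n + len(increment)))
    totals = {s: n + delta[s] for s, n in tracked.items()}
    new_frequent = {s: n for s, n in totals.items() if n >= min_count}
    new_border = {s: n for s, n in totals.items()
                  if n < min_count and is_minimal(s, new_frequent)}
    rescan_needed = any(s not in frequent for s in new_frequent)
    return new_frequent, new_border, rescan_needed


# Illustrative data only: each object is reduced to a set of labels.
old_db = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]
frequent, border = mine(old_db, min_sup=0.5)
frequent, border, rescan_needed = add_objects(
    frequent, border, len(old_db), [{"a", "b"}], min_sup=0.5
)
```

In this sketch the common case (no border itemset crosses the threshold) costs one pass over the added objects, and the result matches what a full re-mining of the enlarged database would give. The expensive case is signalled by `rescan_needed` and left to the caller.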