Incremental Mining of Schema for Semistructured Data

Authors:
Aoying Zhou;Wen Jin;Shuigeng Zhou;Zengping Tian
Affiliations:
-;-;-;-
Venue:
PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
Year:
1999

Citing 11
Cited 0

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Advances in knowledge discovery and data mining

Advances in knowledge discovery and data mining
Querying the World Wide Web

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Data Mining: An Overview from a Database Perspective

IEEE Transactions on Knowledge and Data Engineering
Object Exchange Across Heterogeneous Information Sources

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
WebOQL: Restructuring Documents, Databases, and Webs

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining Generalized Association Rules

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Discovery of Spatial Association Rules in Geographic Information Databases

SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases
A Declarative Language for Querying and Restructuring the Web

RIDE '96 Proceedings of the 6th International Workshop on Research Issues in Data Engineering (RIDE '96) Interoperability of Nontraditional Database Systems

Abstract

Semistructured data is characterized by the lack of any fixed and rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important and imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the implicit structure in the semistructured data. Current methods for extracting the structure of Web data are either general and independent of application background [8], [9], or bound to a concrete environment such as HTML [13], [14], [15]. Both, however, are expensive and have difficulty keeping up with the frequent and complicated changes in Web data. In this paper, we first deal with the problem of incremental mining of schema for semistructured data after the raw data has been updated. An algorithm for incrementally mining the schema of semistructured data is provided, and some experimental results are also given, which show that our incremental mining for semistructured data is more efficient than non-incremental mining.
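
The abstract does not describe the algorithm itself, so the following Python sketch is only an illustration of the general idea of incremental schema mining, not the authors' method. It assumes that semistructured objects are nested dicts and lists, that a "schema" is the set of label paths occurring in at least a given fraction of the objects, and that updates arrive as whole-object insertions or deletions. The names `label_paths`, `IncrementalSchemaMiner`, and `min_support` are hypothetical.

```python
from collections import Counter


def label_paths(obj, prefix=()):
    """Return the set of distinct label paths (tuples of keys) in a nested dict/list."""
    paths = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = prefix + (key,)
            paths.add(path)
            paths |= label_paths(value, path)
    elif isinstance(obj, list):
        # List items share the label path of the list that contains them.
        for item in obj:
            paths |= label_paths(item, prefix)
    return paths


class IncrementalSchemaMiner:
    """Keeps per-path support counts so an update touches only the changed object.

    Assumption: a path belongs to the schema when it occurs in at least
    min_support (a fraction between 0 and 1) of the stored objects.
    """

    def __init__(self, min_support):
        self.min_support = min_support
        self.counts = Counter()   # label path -> number of objects containing it
        self.n_objects = 0

    def insert(self, obj):
        # Each path is counted once per object, however often it repeats inside it.
        self.counts.update(label_paths(obj))
        self.n_objects += 1

    def delete(self, obj):
        # The caller must pass an object that was inserted earlier.
        self.counts.subtract(label_paths(obj))
        self.n_objects -= 1

    def schema(self):
        if self.n_objects == 0:
            return set()
        threshold = self.min_support * self.n_objects
        return {path for path, count in self.counts.items() if count >= threshold}


def mine_from_scratch(objects, min_support):
    """Non-incremental baseline: rescan every object on each call."""
    miner = IncrementalSchemaMiner(min_support)
    for obj in objects:
        miner.insert(obj)
    return miner.schema()
```

In this sketch an insertion or deletion costs time proportional to the size of the changed object, whereas `mine_from_scratch` rescans the whole collection. Both yield the same schema for the same set of objects, which is the property an incremental miner has to preserve.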