Semi-structured data are typically represented as labeled directed graphs. They are self-describing and schemaless, and the lack of a schema makes query processing over them expensive. To address this problem, some researchers have proposed using the structure of the data itself as a schema representation. Such schemas are commonly referred to as graph schemas. However, because semi-structured data are irregular and frequently modified, an accurate graph schema is costly to construct and difficult to maintain thereafter. Furthermore, an accurate graph schema is generally very large and hence impractical. This paper proposes an approximation approach to graph schema extraction. Approximation is achieved by summarizing the semi-structured data graph using an incremental clustering method. Preliminary experimental results show that approximate graph schemas are more compact than conventional accurate graph schemas and are promising for evaluating queries that involve regular path expressions.
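To make the idea of summarizing a data graph by incremental clustering concrete, the following is a minimal sketch and not the paper's algorithm. The abstract does not state the clustering criterion, so everything here is an assumption made for illustration: nodes are compared by the Jaccard similarity of their outgoing edge labels, a node joins the most similar existing cluster when that similarity reaches a hypothetical `threshold`, and the names `jaccard` and `approximate_schema` are invented.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two label sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_schema(edges, threshold=0.5):
    """Summarize a labeled directed graph by clustering its nodes one at a time.

    edges: iterable of (source, label, target) triples.
    threshold: assumed similarity cutoff for joining an existing cluster.
    Returns (assignment, schema_edges): a node -> cluster id mapping and the
    set of (cluster, label, cluster) edges of the summarized graph.
    """
    edges = list(edges)
    out_labels = defaultdict(set)
    nodes, seen = [], set()  # keep first-seen order so results are deterministic
    for src, label, dst in edges:
        out_labels[src].add(label)
        for node in (src, dst):
            if node not in seen:
                seen.add(node)
                nodes.append(node)

    clusters = []    # each cluster is summarized by the union of member labels
    assignment = {}
    for node in nodes:
        labels = out_labels.get(node, set())
        best, best_sim = None, -1.0
        for cid, summary in enumerate(clusters):
            sim = jaccard(labels, summary)
            if sim > best_sim:
                best, best_sim = cid, sim
        if best is None or best_sim < threshold:
            # Nothing similar enough: the node starts a new cluster.
            clusters.append(set(labels))
            best = len(clusters) - 1
        else:
            # Absorb the node and widen the cluster summary.
            clusters[best] |= labels
        assignment[node] = best

    schema_edges = {(assignment[s], lbl, assignment[d]) for s, lbl, d in edges}
    return assignment, schema_edges


# Hypothetical data graph: two "person" objects with slightly different shapes.
example_edges = [
    ("root", "person", "p1"), ("root", "person", "p2"), ("root", "paper", "d1"),
    ("p1", "name", "n1"), ("p1", "email", "e1"),
    ("p2", "name", "n2"), ("p2", "email", "e2"), ("p2", "phone", "t2"),
    ("d1", "title", "t1"), ("d1", "author", "p1"),
]
assignment, schema_edges = approximate_schema(example_edges, threshold=0.5)
```

In this toy graph, `p1` and `p2` fall into the same cluster even though only `p2` has a `phone` edge, so the summarized graph is smaller than the data graph but admits a `person.phone` path for every member of that cluster. That loss of precision is the kind of trade-off an approximate schema accepts in exchange for compactness; raising the assumed `threshold` toward 1.0 moves the result back toward an exact grouping.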