The complex internal structure of documents can be described and captured by document representation standards such as SGML and SGML-related standards like HTML and XML. The hierarchical structure of documents, the attributes of documents, and the attributes of document components at all levels of the document hierarchy can all be encoded with markup tags. Traditional text database systems support only queries on content. They do not capture the rich structural information contained in documents or the attributes of document components, so queries on structure and attributes are not supported.

We propose a text model, a query language, and an indexing scheme that support queries on the content, structure, and attributes of documents, as well as on the attributes of text elements within documents. The model is schema-independent, and query evaluation time is at worst linear. We show that our indexing scheme can efficiently support a wide range of queries in a database of highly heterogeneous collections of structured documents. We provide query examples to show how all the information encoded in documents marked up according to the TEI Guidelines, an encoding standard adopted by the humanities disciplines, can be indexed and queried in our indexing model.
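To make the idea of combined content, structure, and attribute queries concrete, the following is a minimal sketch in Python. It is not the indexing scheme proposed in the paper, whose details the abstract does not give. It assumes documents arrive as well-formed XML, that each element gets an integer id, and that three simple inverted indexes (by tag, by attribute-value pair, and by term) are enough for illustration. The class name `ElementIndex`, its methods, and the sample element names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


class ElementIndex:
    """Illustrative element-level index; NOT the scheme described in the paper."""

    def __init__(self):
        self.tags = {}                    # element id -> tag name
        self.parent = {}                  # element id -> parent id (None for a root)
        self.by_tag = defaultdict(set)    # tag -> element ids
        self.by_attr = defaultdict(set)   # (attribute name, value) -> element ids
        self.by_term = defaultdict(set)   # lowercased term -> element ids
        self._next_id = 0

    def add_document(self, xml_text):
        """Parse one XML document and index every element in it."""
        self._walk(ET.fromstring(xml_text), None)

    def _walk(self, elem, parent_id):
        eid = self._next_id
        self._next_id += 1
        self.tags[eid] = elem.tag
        self.parent[eid] = parent_id
        self.by_tag[elem.tag].add(eid)
        for name, value in elem.attrib.items():
            self.by_attr[(name, value)].add(eid)
        # Text owned directly by this element: its leading text plus the
        # tail text that follows each of its children.
        direct = [elem.text or ""] + [child.tail or "" for child in elem]
        for term in re.findall(r"\w+", " ".join(direct).lower()):
            # Post the term to this element and to every ancestor, so that a
            # content query can be answered at any level of the hierarchy.
            node = eid
            while node is not None:
                self.by_term[term].add(node)
                node = self.parent[node]
        for child in elem:
            self._walk(child, eid)

    def _has_ancestor(self, eid, tag):
        node = self.parent[eid]
        while node is not None:
            if self.tags[node] == tag:
                return True
            node = self.parent[node]
        return False

    def query(self, tag=None, term=None, attr=None, within=None):
        """Return ids of elements that satisfy all of the given conditions.

        tag:    element name (structure)
        term:   word contained in the element or its descendants (content)
        attr:   (name, value) pair on the element itself (attribute)
        within: tag name of some ancestor (structure)
        """
        result = set(self.tags)
        if tag is not None:
            result &= self.by_tag.get(tag, set())
        if term is not None:
            result &= self.by_term.get(term.lower(), set())
        if attr is not None:
            result &= self.by_attr.get(attr, set())
        if within is not None:
            result = {e for e in result if self._has_ancestor(e, within)}
        return result


if __name__ == "__main__":
    # Hypothetical sample loosely modeled on TEI-style element names.
    sample = (
        '<text>'
        '<div type="chapter" n="1"><head>The Storm</head>'
        '<p>Dark clouds gathered.</p></div>'
        '<div type="letter"><p>Dear friend, the storm passed.</p></div>'
        '</text>'
    )
    index = ElementIndex()
    index.add_document(sample)
    hits = index.query(tag="div", term="storm", attr=("type", "letter"))
    print(sorted(hits))
```

Because the index is keyed on tag names and attribute pairs that it discovers while parsing, it needs no schema declared in advance, which is the sense in which this sketch is schema-independent. Posting each term to all ancestors trades index size for simple lookups; that is a convenience of this sketch and makes no claim about the cost bounds stated in the abstract.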