Complete answer aggregates for treelike databases: a novel approach to combine querying and navigation

Authors:
Holger Meuss;Klaus U. Schulz
Affiliations:
Univ. of Munich; Univ. of Munich
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2001

Abstract

The use of markup languages like SGML, HTML or XML for encoding the structure of documents or linguistic data has led to many databases where entries are adequately described as trees. In this context, querying formalisms that offer the possibility to refer both to textual content and to logical structure are of interest. We consider models where the structure specified in a query is used not only as a filter, but also for selecting and presenting different parts of the data. If answers are formalized as mappings from query nodes to the database, a simple enumeration of all mappings in the answer set will often suffer from the effect that many answers have common subparts. From a theoretical point of view, this may lead to an exponential time complexity for computing and presenting all answers. Concentrating on the language of so-called tree queries (a variant and extension of Kilpeläinen's Tree Matching formalism), we introduce the notion of a “complete answer aggregate” for a given query. This new data structure offers a compact view of the set of all answers and supports active exploration of the answer space. Since complete answer aggregates use a powerful structure-sharing mechanism, their maximal size is of order O(d·h·q), where d and q respectively denote the size of the database and the query, and h is the maximal depth of a path of the database. An algorithm is given that computes a complete answer aggregate for a given tree query in time O(d·log(d)·h·q). For the sublanguage of so-called rigid tree queries, as well as for so-called “nonrecursive” databases, an improved bound of O(d·log(d)·q) is obtained. The algorithm is based on a specific index structure that supports practical efficiency.
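
The contrast between enumerating answers and sharing their common subparts can be illustrated with a small sketch. It is not the algorithm of the paper: it is a naive two-pass pruning in the same spirit, assuming queries that only test node labels and descendant edges, and assuming that answer mappings need not be injective. All names (`star`, `aggregate`, `count_answers`, `size`, `brute_force`) and the toy data are hypothetical, and the sketch makes no attempt to reach the time bounds quoted in the abstract.

```python
from itertools import product
from math import prod

def star(root_label, leaf_label, n):
    """Toy tree: node 0 is the root with n children; node -> (label, parent)."""
    tree = {0: (root_label, None)}
    for i in range(1, n + 1):
        tree[i] = (leaf_label, 0)
    return tree

def ancestors(db, v):
    """Proper ancestors of v (at most h of them, h being the tree depth)."""
    out, p = [], db[v][1]
    while p is not None:
        out.append(p)
        p = db[p][1]
    return out

def aggregate(db, query):
    """Per query node: its possible targets; per query edge: shared links."""
    kids = {x: [] for x in query}
    for x, (_, parent) in query.items():
        if parent is not None:
            kids[parent].append(x)
    root = next(x for x, (_, p) in query.items() if p is None)
    targets, links = {}, {}

    def up(x):  # bottom-up: keep targets whose subqueries can be matched
        cand = {v for v, (lab, _) in db.items() if lab == query[x][0]}
        for c in kids[x]:
            up(c)
            edge = {}
            for w in targets[c]:
                for v in ancestors(db, w):
                    if v in cand:
                        edge.setdefault(v, set()).add(w)
            links[(x, c)] = edge
            cand.intersection_update(edge)  # keep targets linked to child c
        targets[x] = cand

    def down(x):  # top-down: drop targets and links unreachable from the root
        for c in kids[x]:
            edge = {v: ws for v, ws in links[(x, c)].items() if v in targets[x]}
            links[(x, c)] = edge
            targets[c] = set().union(*edge.values())
            down(c)

    up(root)
    down(root)
    return root, kids, targets, links

def count_answers(agg):
    """Number of answer mappings, read off the aggregate without listing them."""
    root, kids, targets, links = agg
    def count(x, v):
        return prod(sum(count(c, w) for w in links[(x, c)][v]) for c in kids[x])
    return sum(count(root, v) for v in targets[root])

def size(agg):
    """Stored targets plus stored links."""
    _, _, targets, links = agg
    return (sum(len(t) for t in targets.values())
            + sum(len(ws) for e in links.values() for ws in e.values()))

def brute_force(db, query):
    """Reference count obtained by testing every possible mapping."""
    qs, n = list(query), 0
    for combo in product(db, repeat=len(qs)):
        m = dict(zip(qs, combo))
        if all(db[m[x]][0] == lab and (p is None or m[p] in ancestors(db, m[x]))
               for x, (lab, p) in query.items()):
            n += 1
    return n

agg = aggregate(star("a", "b", 10), star("a", "b", 3))
print(count_answers(agg), size(agg))
```

In this toy case the query asks for an `a` node with three `b` descendants, and the database has ten `b` leaves under one `a` root. There are 10^3 = 1000 answer mappings, but the sketch stores only 61 targets and links, because each query leaf keeps its ten candidate targets once instead of once per answer.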