Semistructured data and XML

Authors: Dan Suciu
Affiliations: AT&T Labs, Florham Park, NJ
Venue: Information organization and databases
Year: 2000

Abstract

XML poses a new set of challenges for semistructured data research. The Extensible Markup Language (XML) is a new recommendation from the World Wide Web Consortium that will become a universal data exchange format for the Web. XML shares many features with semistructured data. It is also easy to convert data from virtually any source into XML, which will make it attractive for organizations to "publish" their information sources in XML and thus make them available to other XML applications on the Web. For such applications to reach their full potential, however, we need to build the right tools to process data in this new format and to perform database operations such as data extraction, data integration, data translation, and data storage. Research done so far on semistructured data may offer some solutions, as illustrated by the query language XML-QL. But, as we argue in this paper, XML creates problems that the research on semistructured data has not yet addressed (e.g., type inference), has not considered important (e.g., distributed evaluation), or simply has not solved yet (e.g., storage).