Semistructured data is characterized by the lack of any fixed and rigid schema, although typically the data has some implicit structure. While the lack of a fixed schema makes extracting semistructured data fairly easy and an attractive goal, it greatly impairs presenting and querying such data. Thus, a critical problem is the discovery of the structure implicit in semistructured data and, subsequently, the recasting of the raw data in terms of this structure. In this paper, we consider a very general form of semistructured data based on labeled, directed graphs. We show that such data can be typed using the greatest fixpoint semantics of monadic datalog programs. We present an algorithm for approximate typing of semistructured data. We establish that the general problem of finding an optimal typing of this kind is NP-hard, but we present heuristics and clustering-based techniques that allow efficient and near-optimal treatment of the problem. We also present some preliminary experimental results.
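To make the idea of greatest fixpoint typing concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes a simplified rule format in which each type lists only the outgoing labeled edges an object must have, together with the type required of each edge's target. The function name `greatest_fixpoint_typing`, the `base` parameter for predefined extents such as atomic values, and the sample data are all hypothetical. The computation starts by placing every object in every type and then removes objects that violate a rule until nothing changes.

```python
from collections import defaultdict


def greatest_fixpoint_typing(objects, edges, rules, base):
    """Compute type extents under greatest fixpoint semantics (simplified sketch).

    objects: iterable of object ids
    edges:   iterable of (source, label, target) triples (the labeled directed graph)
    rules:   dict mapping a type name to a set of (label, target_type) requirements;
             an object has the type if, for every requirement, it has an outgoing
             edge with that label to some object of target_type (an assumed rule form)
    base:    dict mapping predefined type names (e.g. "atomic") to fixed extents
    """
    # Index outgoing edges by source object.
    out = defaultdict(set)
    for src, label, dst in edges:
        out[src].add((label, dst))

    # Start from the largest candidate assignment: every object has every type.
    extent = {t: set(ext) for t, ext in base.items()}
    for t in rules:
        extent[t] = set(objects)

    # Repeatedly discard objects that fail a requirement until stable.
    changed = True
    while changed:
        changed = False
        for t, body in rules.items():
            for obj in list(extent[t]):
                satisfied = all(
                    any(lbl == label and dst in extent[target]
                        for lbl, dst in out[obj])
                    for label, target in body
                )
                if not satisfied:
                    extent[t].discard(obj)
                    changed = True
    return extent


if __name__ == "__main__":
    # Hypothetical data: p1 and p2 are friends of each other; p3 has no friend edge.
    objects = {"p1", "p2", "p3", "n1", "n2", "n3"}
    edges = {
        ("p1", "name", "n1"), ("p1", "friend", "p2"),
        ("p2", "name", "n2"), ("p2", "friend", "p1"),
        ("p3", "name", "n3"),
    }
    rules = {"person": {("name", "atomic"), ("friend", "person")}}
    base = {"atomic": {"n1", "n2", "n3"}}
    print(sorted(greatest_fixpoint_typing(objects, edges, rules, base)["person"]))
```

In this sample, `p1` and `p2` justify each other's membership in `person` through a cycle of `friend` edges. Starting from the full assignment and shrinking keeps them, whereas building the extent up from the empty set would never admit either one. This is one intuition for why a greatest fixpoint suits cyclic graph data.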