Indexing dataspaces

Authors:
Xin Dong; Alon Halevy
Affiliations:
University of Washington, Seattle, WA; Google Inc., Mountain View, CA
Venue:
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Year:
2007

Abstract

Dataspaces are collections of heterogeneous and partially unstructured data. Unlike data-integration systems, which also offer uniform access to heterogeneous data sources, dataspaces do not assume that all the semantic relationships between sources are known and specified. Much of the user interaction with dataspaces involves exploring the data, and users do not have a single schema against which they can pose queries. Consequently, it is important that queries be allowed to specify varying degrees of structure, ranging from keyword queries to more structure-aware queries. This paper considers indexing support for queries that combine keywords and structure. We describe several extensions to inverted lists to capture structure when it is present. In particular, our extensions incorporate attribute labels, relationships between data items, hierarchies of schema elements, and synonyms among schema elements. We describe experiments showing that our indexing techniques improve query efficiency by an order of magnitude compared with alternative approaches, and scale well with the size of the data.
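To make the idea of an inverted list extended with attribute labels and synonyms more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's actual index layout: the toy data in `ITEMS`, the `SYNONYMS` table, the `//` separator in index keys, and the function names `build_index` and `search` are all hypothetical. Relationships between items and schema hierarchies are not covered.

```python
from collections import defaultdict

# Hypothetical toy data: item id -> {attribute label: text value}.
ITEMS = {
    1: {"title": "Indexing dataspaces", "venue": "SIGMOD"},
    2: {"name": "Xin Dong"},
    3: {"name": "Alon Halevy"},
}

# Hypothetical synonym table: attribute label -> canonical label.
SYNONYMS = {"fullname": "name"}


def canonical(attr):
    """Map an attribute label to its canonical (lowercase) form."""
    label = attr.lower()
    return SYNONYMS.get(label, label)


def build_index(items):
    """Build an inverted list with two kinds of keys.

    "word"        -> items containing the word anywhere
    "word//label" -> items containing the word in that attribute
    The "//" separator is an assumption made for this sketch.
    """
    index = defaultdict(set)
    for item_id, attrs in items.items():
        for attr, text in attrs.items():
            for word in text.lower().split():
                index[word].add(item_id)
                index[f"{word}//{canonical(attr)}"].add(item_id)
    return index


def search(index, keyword, attr=None):
    """Return sorted ids of items matching a keyword, optionally within an attribute."""
    key = keyword.lower()
    if attr is not None:
        key = f"{key}//{canonical(attr)}"
    return sorted(index.get(key, set()))


INDEX = build_index(ITEMS)
```

With this toy index, `search(INDEX, "dong")` is a pure keyword query, while `search(INDEX, "dong", "fullname")` adds structure and resolves the synonym `fullname` to `name` before the lookup. Both kinds of query are answered by a single key lookup in the same inverted list, which is the property the sketch is meant to show.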