Keyword search in relational databases has been extensively studied. Given a relational database, a keyword query finds a set of interconnected tuple structures connected by foreign key references. On an RDBMS, a keyword query is processed in two steps, candidate network (CN) generation and CN evaluation, where each CN is an SQL query. A keyword query commonly requires evaluating more than 10,000 SQL queries. There are several approaches to processing a keyword query on an RDBMS, but the performance achievable on a uniprocessor architecture is limited. In this paper, we study the parallel computation of keyword queries on a multicore architecture. We make three observations about keyword query computation: a large number of SQL queries must be processed, the queries have a high potential for sharing, and the intermediate results are large while the final results are few. All three make parallel keyword query computation challenging. We investigate three approaches. We first study query-level parallelism, where each SQL query is processed by one core. We distribute the SQL queries across cores according to three objectives: minimizing workload skew, minimizing inter-core sharing, and maximizing intra-core sharing. This approach risks load imbalance as cost-estimation errors accumulate. We then study operation-level parallelism, where each operation of an SQL query is processed by one core. All operations are processed in stages, and in each stage the costs of operations are re-estimated to reduce the accumulated error. Operation-level parallelism still suffers from workload skew when large operations are involved and a large number of cores are used. Finally, we propose a new algorithm that partitions relations adaptively to minimize the extra cost of partitioning while reducing workload skew.
We conducted extensive performance studies using two large real datasets, DBLP and IMDB, and we report the efficiency of our approaches in this paper.
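
To make the workload-skew objective of query-level parallelism concrete, the sketch below assigns queries to cores with a standard greedy longest-processing-time-first heuristic. It is only an illustration under stated assumptions, not the distribution algorithm of the paper: the function name `assign_queries_to_cores`, the query identifiers, and the cost values are hypothetical, and the sketch ignores the two sharing objectives entirely.

```python
import heapq


def assign_queries_to_cores(costs, num_cores):
    """Greedy longest-processing-time-first assignment (illustrative only).

    costs: dict mapping a query id to its estimated cost (hypothetical units).
    Returns (loads, assignment): the total estimated cost per core and the
    list of query ids placed on each core.
    """
    # Min-heap of (current load, core index): the least-loaded core is on top.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_cores)]
    loads = [0.0] * num_cores
    # Place expensive queries first; ties are broken by query id.
    for qid, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, core = heapq.heappop(heap)
        assignment[core].append(qid)
        loads[core] = load + cost
        heapq.heappush(heap, (loads[core], core))
    return loads, assignment


if __name__ == "__main__":
    # Hypothetical estimated costs for five candidate-network queries.
    estimated = {"q1": 8, "q2": 7, "q3": 6, "q4": 5, "q5": 4}
    loads, assignment = assign_queries_to_cores(estimated, num_cores=2)
    print(loads, assignment)
```

Because the assignment is fixed up front from estimated costs, any estimation error stays in the schedule, which is the load-imbalance risk noted for query-level parallelism.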