Spark: top-k keyword query in relational databases

Authors:
Yi Luo;Xuemin Lin;Wei Wang;Xiaofang Zhou
Affiliations:
University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of Queensland, Brisbane, Australia
Venue:
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Year:
2007

Abstract

With the increasing amount of text data stored in relational databases, there is a demand for RDBMSs to support keyword queries over text data. Because a search result is often assembled from multiple relational tables, traditional IR-style ranking and query evaluation methods cannot be applied directly. In this paper, we study the effectiveness and efficiency issues of answering top-k keyword queries in relational database systems. We propose a new ranking formula by adapting existing IR techniques based on a natural notion of a virtual document. Compared with previous approaches, our new ranking method is simple yet effective, and agrees with human perception. We also study efficient query processing methods for the new ranking method, and propose algorithms that make minimal accesses to the database. We have conducted extensive experiments on large-scale real databases using two popular RDBMSs. The experimental results demonstrate significant improvement over the alternative approaches in terms of retrieval effectiveness and efficiency.
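The virtual-document idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each search result is a list of text values taken from the joined tuples, and that their concatenation is scored as a single document. The TF-IDF weighting below is a generic stand-in chosen for illustration, not the ranking formula proposed in the paper, and the function names (`virtual_document`, `score`, `top_k`) are hypothetical. The sketch also scores every candidate exhaustively, whereas the paper's algorithms aim to minimize database accesses.

```python
import heapq
import math
from collections import Counter


def virtual_document(joined_tuples):
    """Concatenate the text of every tuple in one joined result into a token list."""
    tokens = []
    for text in joined_tuples:
        tokens.extend(text.lower().split())
    return tokens


def score(tokens, keywords, doc_freq, num_docs):
    """Generic TF-IDF score with length normalization (illustrative assumption only)."""
    if not tokens:
        return 0.0
    tf = Counter(tokens)
    total = 0.0
    for kw in keywords:
        if tf[kw] == 0:
            continue
        # Smoothed inverse document frequency over the candidate results.
        idf = math.log((num_docs + 1) / (doc_freq.get(kw, 0) + 1)) + 1.0
        total += (1.0 + math.log(tf[kw])) * idf
    # Dampen the score of long virtual documents.
    return total / (1.0 + math.log(len(tokens)))


def top_k(results, keywords, k):
    """Return the k best (score, result_index) pairs, highest score first."""
    keywords = [kw.lower() for kw in keywords]
    docs = [virtual_document(r) for r in results]
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc) & set(keywords))
    scored = [(score(doc, keywords, doc_freq, len(docs)), i)
              for i, doc in enumerate(docs)]
    return heapq.nlargest(k, scored)


# Each inner list holds the text of the tuples joined into one candidate result.
candidates = [
    ["Spark keyword search", "relational databases"],
    ["graph search"],
    ["cooking recipes"],
]
ranked = top_k(candidates, ["keyword", "search"], k=2)
```

Scoring the concatenated text, rather than each tuple separately, is what lets a result that spreads the keywords across several tables compete with a result that contains them all in one tuple.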