Unlike traditional database queries, keyword queries do not adhere to a predefined syntax and are often dirty, containing irrelevant words from natural language. This makes accurate and efficient keyword query processing over databases very challenging. In this paper, we introduce the problem of query cleaning for keyword search queries in a database context and propose a set of effective and efficient solutions. Query cleaning involves semantic linkage and spelling correction of database-relevant query words, followed by segmentation of nearby query words such that each segment corresponds to a high-quality data term. We define a quality metric for keyword queries and propose a number of algorithms for cleaning keyword queries optimally. We show that the basic optimal query cleaning problem can be solved using a dynamic programming algorithm. We further extend the basic algorithm to address incremental query cleaning and top-k optimal query cleaning. Incremental query cleaning is efficient and memory-bounded, and is therefore well suited to scenarios in which keywords are streamed. The top-k query cleaning algorithm is guaranteed to return the best k cleaned keyword queries in ranked order. Extensive experiments on three real-life data sets confirm the effectiveness and efficiency of the proposed solutions.
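To make the segmentation step concrete, the following is a minimal sketch of how a dynamic program might choose the best split of a query into segments. It is an illustration under stated assumptions, not the paper's algorithm: the names `segment_query`, `toy_score`, and `TERM_SCORES`, the additive scoring scheme, and the `max_len` bound are all hypothetical, and semantic linkage and spelling correction are omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical term scores standing in for a quality metric over data terms.
TERM_SCORES = {
    ("new", "york"): 3.0,
    ("new",): 1.0,
    ("york",): 1.0,
    ("hotel",): 1.0,
}


def toy_score(segment: Tuple[str, ...]) -> float:
    """Assumed scoring rule: known terms get their score, unknown single
    tokens score 0, and unknown multi-token segments are disallowed."""
    if segment in TERM_SCORES:
        return TERM_SCORES[segment]
    return 0.0 if len(segment) == 1 else float("-inf")


def segment_query(
    tokens: List[str],
    score: Callable[[Tuple[str, ...]], float],
    max_len: int = 3,
) -> Tuple[float, List[List[str]]]:
    """Return the highest-scoring segmentation of tokens and its score."""
    n = len(tokens)
    # best[i] is the best total score for the first i tokens.
    best = [float("-inf")] * (n + 1)
    best[0] = 0.0
    # back[i] is the start index of the last segment in that best split.
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(tuple(tokens[start:end]))
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Walk the back-pointers from the end to recover the segments.
    segments: List[List[str]] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tokens[start:end])
        end = start
    segments.reverse()
    return best[n], segments


total, segments = segment_query(["cheap", "new", "york", "hotel"], toy_score)
print(total, segments)
```

With the toy scores above, the sketch groups "new york" into one segment and leaves "cheap" and "hotel" as single-token segments. Bounding segment length with `max_len` keeps the running time linear in the number of tokens for a fixed bound, which is one plausible design choice rather than something the abstract specifies.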