Towards keyword-driven analytical processing

Authors:
Ping Wu;Yannis Sismanis;Berthold Reinwald
Affiliations:
University of California, Santa Barbara, CA;IBM Almaden Research Center, San Jose, CA;IBM Almaden Research Center, San Jose, CA
Venue:
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Year:
2007

Abstract

Gaining business insights from data has recently been the focus of research and product development. On-Line Analytical Processing (OLAP) tools provide elaborate query languages that allow users to group and aggregate data in various ways and to explore interesting trends and patterns in the data. However, the dynamic nature of today's data, along with the overwhelming level of detail at which data is provided, makes it nearly impossible to organize the data in the way a business analyst needs in order to reason about it. In this paper, we introduce "Keyword-Driven Analytical Processing" (KDAP), which combines intuitive keyword-based search with the power of aggregation in OLAP, without requiring considerable effort to organize the data in terms that the business analyst understands. Our design point is a user mentality that we frequently encounter: "users don't know how to specify what they want, but they know it when they see it". We present our complete solution framework, which implements the phases from disambiguating the keyword terms to organizing and ranking the results in dynamic facets that allow the user to explore the aggregation space efficiently. We address specific issues that analysts encounter, such as joins, groupings, and aggregations, and we provide efficient and scalable solutions. We show how KDAP handles categorical and numerical data equally well. Finally, using various experiments on real data, we demonstrate the generality and applicability of KDAP to two different aspects of OLAP: finding exceptions or surprises in the data, and finding bellwether regions, where local aggregates are highly correlated with global aggregates.
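
As a rough illustration of the idea described in the abstract, the following minimal sketch shows a keyword being disambiguated into candidate (attribute, value) interpretations, after which one interpretation is aggregated along another dimension as a facet. It is not the KDAP implementation: the toy table, the column names, and the helper functions `interpretations` and `facet` are all hypothetical, and the substring match is a stand-in for whatever matching and ranking the paper actually uses.

```python
from collections import defaultdict

# Hypothetical toy fact table; columns and values are illustrative only.
SALES = [
    {"city": "Columbus", "holiday": "None", "product": "TV", "revenue": 500.0},
    {"city": "Seattle", "holiday": "Columbus Day", "product": "TV", "revenue": 300.0},
    {"city": "Columbus", "holiday": "Columbus Day", "product": "Radio", "revenue": 200.0},
    {"city": "Seattle", "holiday": "None", "product": "Radio", "revenue": 100.0},
]
DIMENSIONS = ["city", "holiday", "product"]


def interpretations(keyword):
    """Return sorted (attribute, value) pairs whose value contains the keyword.

    Assumption: a case-insensitive substring match stands in for real
    keyword disambiguation.
    """
    kw = keyword.lower()
    hits = {
        (dim, row[dim])
        for row in SALES
        for dim in DIMENSIONS
        if kw in row[dim].lower()
    }
    return sorted(hits)


def facet(attribute, value, group_by, measure="revenue"):
    """Sum the measure over rows where attribute == value, grouped by group_by."""
    totals = defaultdict(float)
    for row in SALES:
        if row[attribute] == value:
            totals[row[group_by]] += row[measure]
    return dict(totals)


# The keyword is ambiguous: it matches a city and a holiday.
candidates = interpretations("columbus")

# Each interpretation yields a different aggregate view over the same data.
by_city = facet("city", "Columbus", group_by="product")
by_holiday = facet("holiday", "Columbus Day", group_by="product")
```

Here the single keyword "columbus" produces two interpretations, and the user can pick whichever aggregate view matches their intent, which mirrors the "they know it when they see it" design point in spirit only.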