AlphaSum: size-constrained table summarization using value lattices

Authors:
K. Selçuk Candan;Huiping Cao;Yan Qi;Maria Luisa Sapino
Affiliations:
Arizona State Univ., Tempe, AZ;Arizona State Univ., Tempe, AZ;Arizona State Univ., Tempe, AZ;Univ. di Torino, Torino, Italy
Venue:
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2009

Abstract

Consider a scientist who wants to explore multiple data sets to select the relevant ones for further analysis. Since the visualization real estate may put a stringent constraint on how much detail can be presented to this user in a single page, effective table summarization techniques are needed to create summaries that are both sufficiently small and effective in communicating the available content. In this paper, we first argue that table summarization can benefit from knowledge about acceptable alternatives for clustering the values in the database. We formulate the problem of table summarization with the help of value lattices. We then provide a framework to express alternative clustering strategies and to account for various utility measures (such as information loss) in assessing different summarization alternatives. Based on this interpretation, we introduce three preference criteria, max-min-util (cautious), max-sum-util (cumulative), and pareto-util, for the problem of table summarization. To tackle the inherent complexity, we rely on the properties of the fuzzy interpretation to further develop a novel ranked set cover based evaluation mechanism (RSC). These are brought together in AlphaSum, a table summarization system. Experimental evaluations show that RSC improves both the execution time and the summary quality of AlphaSum by pruning the search space more effectively than existing solutions.
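
The abstract names the three preference criteria but does not define them. The sketch below shows one plausible reading, offered purely as an assumption and not as the paper's formulation. It assumes max-min-util prefers the candidate summary whose worst utility component is highest, max-sum-util prefers the highest total utility, and pareto-util keeps every candidate that no other candidate dominates. The candidate names, the utility vectors, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input: each candidate summary is scored by a vector of
# utilities in [0, 1] (for example, one score per attribute). These numbers
# are invented for illustration and do not come from the paper.
Utilities = Tuple[float, ...]


def max_min_util(cands: Dict[str, Utilities]) -> str:
    """Assumed 'cautious' criterion: maximize the worst utility component."""
    return max(cands, key=lambda name: min(cands[name]))


def max_sum_util(cands: Dict[str, Utilities]) -> str:
    """Assumed 'cumulative' criterion: maximize the total utility."""
    return max(cands, key=lambda name: sum(cands[name]))


def dominates(a: Utilities, b: Utilities) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_util(cands: Dict[str, Utilities]) -> List[str]:
    """Assumed Pareto criterion: keep candidates no other candidate dominates."""
    return sorted(
        name
        for name, util in cands.items()
        if not any(dominates(other, util)
                   for other_name, other in cands.items()
                   if other_name != name)
    )


candidates: Dict[str, Utilities] = {
    "s1": (0.9, 0.2, 0.8),  # strong overall, one weak component
    "s2": (0.6, 0.6, 0.6),  # balanced
    "s3": (0.5, 0.5, 0.6),  # never better than s2
}

print(max_min_util(candidates))
print(max_sum_util(candidates))
print(pareto_util(candidates))
```

Under these assumed definitions the criteria can disagree on the same candidates, which is why the choice of criterion matters: a cautious reading favors the balanced summary, a cumulative reading favors the one with the highest total, and the Pareto reading returns a set rather than a single winner.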