We are awash in data. The explosion in computing power and infrastructure allows us to generate vast amounts of data, in differing formats, at different scales, and in interrelated areas. Data management is fundamentally about harnessing this data to extract information, discovering good representations of that information, and analyzing information sources to glean structure. Data management generally presents us with cost-benefit tradeoffs. If we store more information, we get better answers to queries, but we pay the price in increased storage. Conversely, reducing the amount of information we store improves performance at the cost of less accurate query results. The ability to quantify information gain or loss can only improve our ability to design good representations, storage mechanisms, and analysis tools for data.