Rearranging data to maximize the efficiency of compression
PODS '86 Proceedings of the fifth ACM SIGACT-SIGMOD symposium on Principles of database systems
Algorithms for clustering data
Introduction to statistical pattern recognition (2nd ed.)
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Efficiently supporting ad hoc queries in large datasets of time sequences
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
SPARTAN: a model-based semantic compression system for massive data tables
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Computers and Intractability: A Guide to the Theory of NP-Completeness
A Microeconomic View of Data Mining
Data Mining and Knowledge Discovery
Block-Oriented Compression Techniques for Large Statistical Databases
IEEE Transactions on Knowledge and Data Engineering
Constraint-based clustering in large databases
ICDT '01 Proceedings of the 8th International Conference on Database Theory
Spatial Clustering in the Presence of Obstacles
Proceedings of the 17th International Conference on Data Engineering
Semantic Compression and Pattern Extraction with Fascicles
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
A Projection Pursuit Algorithm for Exploratory Data Analysis
IEEE Transactions on Computers
General purpose database summarization
VLDB '05 Proceedings of the 31st international conference on Very large data bases
XQueC: A query-conscious compressed XML database
ACM Transactions on Internet Technology (TOIT)
Mine your own business, mine others' news!
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Discovering data quality rules
Proceedings of the VLDB Endowment
On domination game analysis for microeconomic data mining
ACM Transactions on Knowledge Discovery from Data (TKDD)
Semantic enabled metadata management in PetaShare
International Journal of Grid and Utility Computing
Time sequence summarization to scale up chronology-dependent applications
Proceedings of the 18th ACM conference on Information and knowledge management
Synopses for probabilistic data over large domains
Proceedings of the 14th International Conference on Extending Database Technology
Document decomposition for XML compression: a heuristic approach
DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Data summarization for network traffic monitoring
Journal of Network and Computer Applications
Real datasets are often large enough to necessitate data compression. Traditional 'syntactic' data compression methods treat the table as a large byte string and operate at the byte level. The tradeoff in such cases is usually between the ease of retrieval (the ease with which one can retrieve a single tuple or attribute value without decompressing a much larger unit) and the effectiveness of the compression. In this regard, the use of semantic compression has generated considerable interest and motivated certain recent works.

In this paper, we propose a semantic compression algorithm called ItCompress (ITerative Compression), which achieves good compression while permitting access even at the attribute level without requiring the decompression of a larger unit. ItCompress iteratively improves the compression ratio of the compressed output during each scan of the table. The amount of compression can be tuned based on the number of iterations. Moreover, the initial iterations provide significant compression, thereby making it a cost-effective compression technique. Extensive experiments were conducted and the results indicate the superiority of ItCompress with respect to previously known techniques, such as 'SPARTAN' and 'fascicles'.
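
The abstract does not spell out how the iterations work, so the following is only a minimal sketch of one way an iterative semantic compressor with attribute-level access could look. It assumes a representative-row scheme: each tuple is stored as a reference to a representative row, a per-attribute match mask, and the values that differ. The function names (`assign`, `update`, `compress`, `decode_value`), the choice of the first `k` rows as initial representatives, and the use of exact equality for matching are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def assign(rows, reps):
    """For each row, pick the index of the representative that matches it on the most attributes."""
    def matches(row, rep):
        return sum(a == b for a, b in zip(row, rep))
    return [max(range(len(reps)), key=lambda i: matches(row, reps[i])) for row in rows]


def update(rows, assignment, reps):
    """Rebuild each representative from the most common value per attribute among its rows."""
    new_reps = []
    for i, rep in enumerate(reps):
        members = [row for row, a in zip(rows, assignment) if a == i]
        if not members:
            new_reps.append(rep)  # keep an unused representative unchanged
            continue
        new_reps.append(tuple(Counter(col).most_common(1)[0][0] for col in zip(*members)))
    return new_reps


def compress(rows, k, iterations):
    """Assumed scheme: one pass over the table per iteration, refining k representatives."""
    reps = [tuple(row) for row in rows[:k]]  # assumption: seed with the first k rows
    for _ in range(iterations):
        reps = update(rows, assign(rows, reps), reps)
    encoded = []
    for row, i in zip(rows, assign(rows, reps)):
        mask = [a == b for a, b in zip(row, reps[i])]
        outliers = [a for a, m in zip(row, mask) if not m]  # values the representative gets wrong
        encoded.append((i, mask, outliers))
    return reps, encoded


def decode_value(reps, encoded, r, c):
    """Read attribute c of row r without touching any other row."""
    i, mask, outliers = encoded[r]
    if mask[c]:
        return reps[i][c]
    # Position among the stored outliers = number of mismatches before column c.
    return outliers[sum(1 for m in mask[:c] if not m)]
```

In this sketch, more iterations can only be paid for with more scans of the table, which is the tuning knob the abstract describes, and `decode_value` illustrates attribute-level access: it consults one encoded row and one representative, nothing else.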