BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
ACM Computing Surveys (CSUR)
Communications of the ACM
Handbook of data mining and knowledge discovery
Handbook of data mining and knowledge discovery
Incremental Clustering and Dynamic Information Retrieval
SIAM Journal on Computing
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Data Streams: Models and Algorithms (Advances in Database Systems)
Data Streams: Models and Algorithms (Advances in Database Systems)
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Efficient query processing for multi-dimensionally clustered tables in DB2
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Self-tuning database systems: a decade of progress
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Configuration-parametric query optimization for physical design tuning
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Handbook of Granular Computing
Handbook of Granular Computing
Brighthouse: an analytic data warehouse for ad-hoc queries
Proceedings of the VLDB Endowment
Architecture of a Database System
Foundations and Trends in Databases
The Database Architecture Jigsaw Puzzle
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Data warehouse technology by infobright
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Semantic knowledge integration to support inductive query optimization
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Injecting domain knowledge into a granular database engine: a position paper
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Towards approximate SQL: infobright's approach
RSCTC'10 Proceedings of the 7th international conference on Rough sets and current trends in computing
One of the major aspects of Infobright's relational database technology is the automatic decomposition of each data table into Rough Rows, each consisting of 64K original rows. Rough Rows are automatically annotated with Knowledge Nodes, which hold compact information about the rows' values. Query performance depends on the quality of the Knowledge Nodes, i.e., on how effectively they minimize access to the compressed portions of data stored on disk under the specific query optimization procedures. We show how to implement a mechanism that organizes incoming data into Rough Rows so as to maximize the quality of the corresponding Knowledge Nodes. Given clear business-driven requirements, the mechanism must be fully integrated with the data load process and must not decrease the data load speed. The performance gain resulting from better data organization is illustrated by tests over our benchmark data. The differences between the proposed mechanism and some well-known procedures of database clustering or partitioning are discussed. The paper is a continuation of our patent application [22].
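To make the link between data organization and Knowledge Node quality concrete, here is a minimal sketch rather than Infobright's actual implementation. It assumes a Knowledge Node is simply the minimum and maximum of one column within one Rough Row; the abstract says only that Knowledge Nodes hold "compact information about the rows' values". The names `KnowledgeNode`, `build_knowledge_nodes`, and `rough_rows_to_open` are hypothetical, and plain sorting stands in for the load-time organization mechanism, which the abstract does not describe.

```python
from dataclasses import dataclass
from typing import List, Sequence

ROUGH_ROW_SIZE = 65536  # 64K original rows per Rough Row, as stated in the abstract


@dataclass
class KnowledgeNode:
    """Hypothetical summary of one column in one Rough Row (assumption: min/max only)."""
    lo: int
    hi: int


def build_knowledge_nodes(values: Sequence[int],
                          size: int = ROUGH_ROW_SIZE) -> List[KnowledgeNode]:
    """Cut a column into consecutive Rough Rows and summarize each one."""
    nodes = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        nodes.append(KnowledgeNode(min(chunk), max(chunk)))
    return nodes


def rough_rows_to_open(nodes: Sequence[KnowledgeNode], lo: int, hi: int) -> List[int]:
    """Indexes of Rough Rows whose [min, max] range overlaps the query range [lo, hi].

    Rough Rows that do not overlap can be skipped without touching the
    compressed data on disk.
    """
    return [i for i, n in enumerate(nodes) if n.hi >= lo and n.lo <= hi]


# Toy column with a Rough Row size of 4 instead of 64K, to keep the example readable.
arrival_order = [1, 9, 2, 8, 3, 7, 4, 6]
organized = sorted(arrival_order)  # stand-in for the load-time organization

opened_unorganized = rough_rows_to_open(build_knowledge_nodes(arrival_order, 4), 1, 3)
opened_organized = rough_rows_to_open(build_knowledge_nodes(organized, 4), 1, 3)
```

For the range query 1 to 3, the unorganized column forces both Rough Rows to be opened, whereas the organized column lets the second one be excluded from its Knowledge Node alone. A full sort would conflict with the stated requirement of not slowing down the load, which is why it serves here only as an illustration of the effect.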