It is shown that duplicate values in some attribute columns have a significant impact on the computational complexity of the sorting and join operations, especially when the number of distinct tuple values is a small fraction of the total number of tuples. The authors characterize a multirelation M(n, L) by its cardinality n and the number of distinct elements L it contains. Under this characterization, the worst-case time complexity of sorting such a multirelation with binary comparisons as the basic operations is investigated, and upper and lower bounds on the number of three-branch comparisons needed to sort it are established. The methodology used to study the complexity of sorting is then applied to the natural join operation. It is shown that duplicate values in the join attribute columns can be exploited to reduce the computational complexity of the natural join.
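To make the role of L concrete, the following is a minimal sketch of sorting a multiset with three-branch comparisons, assuming integer keys. It is an illustration only and not the algorithm analyzed by the authors. The function name sort_with_duplicates and the comparison counter are hypothetical. Each element is located by binary search among the distinct values seen so far, so it costs at most floor(log2 L) + 1 three-branch comparisons, roughly n log L in total rather than n log n.

```python
from typing import List, Tuple


def sort_with_duplicates(values: List[int]) -> Tuple[List[int], int]:
    """Sort `values`; return (sorted list, number of three-branch comparisons).

    Hypothetical helper for illustration: only comparisons are counted,
    not the cost of moving data when a new distinct key is inserted.
    """
    keys: List[int] = []    # distinct values seen so far, ascending
    counts: List[int] = []  # counts[i] is the multiplicity of keys[i]
    comparisons = 0

    for v in values:
        lo, hi = 0, len(keys)
        found = False
        while lo < hi:
            mid = (lo + hi) // 2
            comparisons += 1  # one comparison with three outcomes: <, =, >
            if v == keys[mid]:
                counts[mid] += 1  # duplicate: bump the count and stop early
                found = True
                break
            if v < keys[mid]:
                hi = mid
            else:
                lo = mid + 1
        if not found:
            # New distinct value: insert it at its sorted position.
            keys.insert(lo, v)
            counts.insert(lo, 1)

    # Expand the (key, count) pairs back into a sorted multiset.
    result: List[int] = []
    for k, c in zip(keys, counts):
        result.extend([k] * c)
    return result, comparisons


if __name__ == "__main__":
    data = [3, 1, 3, 2, 1, 3, 3, 2]  # n = 8, L = 3
    ordered, used = sort_with_duplicates(data)
    print(ordered, used)
```

When L is much smaller than n, the sorted list of distinct keys stays short and most elements resolve on an equality branch, which is the effect the abstract attributes to duplicates.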