Multi-table joins through bitmapped join indices
ACM SIGMOD Record
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Improved query performance with variant indexes
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
On saying “Enough already!” in SQL
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
An efficient bitmap encoding scheme for selection queries
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Fast and effective text mining using linear-time document clustering
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
CACTUS—clustering categorical data using summaries
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
ACM Computing Surveys (CSUR)
Mining the stock market (extended abstract): which measure is best?
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Data mining: concepts and techniques
Data mining: concepts and techniques
Scalability for clustering algorithms revisited
ACM SIGKDD Explorations Newsletter
Optimal aggregation algorithms for middleware
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
PREFER: a system for the efficient execution of multi-parametric ranked queries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Encoded Bitmap Indexing for Data Warehouses
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Model 204 Architecture and Performance
Proceedings of the 2nd International Workshop on High Performance Transaction Systems
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Evaluating Top-k Selection Queries
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Probabilistic Optimization of Top N Queries
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
STING: A Statistical Information Grid Approach to Spatial Data Mining
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Automatic categorization of query results
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
RankSQL: query algebra and optimization for relational top-k queries
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Optimizing bitmap indices with efficient compression
ACM Transactions on Database Systems (TODS)
A Generalized K-Means Algorithm with Semi-Supervised Weight Coefficients
ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 03
Supporting top-K join queries in relational databases
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Bitmap indexes for large scientific data sets: a case study
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Weighted k-means for density-biased clustering
DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
Cluster By: a new sql extension for spatial data aggregation
Proceedings of the 15th annual ACM international symposium on Advances in geographic information systems
Probabilistic ranked queries in uncertain databases
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Efficient computation of personal aggregate queries on blogs
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Exploiting similarity-aware grouping in decision support systems
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Rank-aware clustering of structured datasets
Proceedings of the 18th ACM conference on Information and knowledge management
Cluster based rank query over multidimensional data streams
Proceedings of the 18th ACM conference on Information and knowledge management
Using trees to depict a forest
Proceedings of the VLDB Endowment
Grouping Results of Queries to Ontological Knowledge Bases by Conceptual Clustering
ICCCI '09 Proceedings of the 1st International Conference on Computational Collective Intelligence. Semantic Web, Social Networks and Multiagent Systems
Query Results Clustering by Extending SPARQL with CLUSTER BY
OTM '09 Proceedings of the Confederated International Workshops and Posters on On the Move to Meaningful Internet Systems: ADI, CAMS, EI2N, ISDE, IWSSA, MONET, OnToContent, ODIS, ORM, OTM Academy, SWWS, SEMELS, Beyond SAWSDL, and COMBEK 2009
Ranking weak-linked documents on the web
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 1
Querying streaming point clusters as regions
Proceedings of the ACM SIGSPATIAL International Workshop on GeoStreaming
On a fuzzy group-by and its use for fuzzy association rule mining
ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
Making interval-based clustering rank-aware
Proceedings of the 14th International Conference on Extending Database Technology
Skimmer: rapid scrolling of relational query results
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Similarity queries: their conceptual evaluation, transformations, and processing
The VLDB Journal — The International Journal on Very Large Data Bases
Hi-index | 0.00 |
The Boolean semantics of SQL queries cannot adequately capture the "fuzzy" preferences and "soft" criteria required in non-traditional data retrieval applications. One way to solve this problem is to add a flavor of "information retrieval" into database queries by allowing fuzzy query conditions and flexibly supporting grouping and ranking of the query results within the DBMS engine. While ranking is already supported by all major commercial DBMSs natively, support of flexibly grouping is still very limited (i.e., group-by). In this paper, we propose to generalize group-by to enable flexible grouping (clustering specifically) of the query results. Different from clustering in data mining applications, our focus is on supporting efficient clustering of Boolean results generated at query time. Moreover, we propose to integrate ranking and clustering with Boolean conditions, forming a new type of ClusterRank query to allow structured data retrieval. Such an integration is nontrivial in terms of both semantics and query processing. We investigate various semantics of this type of queries. To process such queries, a straightforward approach is to simply glue the techniques developed for ranking-only and clustering-only together. This approach is costly since both ranking and clustering are treated as blocking post-processing tasks upon Boolean query results by existing techniques. We propose a summary-based evaluation method that utilizes bitmap index to seamlessly integrate Boolean conditions, clustering, and ranking. Experimental study shows that our approach significantly outperforms the straightforward one and maintains high clustering quality.