Scalable diversification of multiple search results
Proceedings of the 22nd ACM international conference on information & knowledge management
Data diversification provides users with a concise and meaningful view of the results returned by search queries. Besides taming information overload, it reduces data communication costs and enables data exploration. The explosion of big data heightens the need for data diversification in modern data management platforms, especially for applications built on web, scientific, and business databases. Achieving effective diversification, however, is challenging because of the inherently high processing costs of current data diversification techniques. The challenge is further accentuated in a multi-user environment, in which multiple search queries are executed and diversified concurrently. In this paper, we propose the DoS scheme, which addresses the problem of scalable diversification of multiple search results. Our experimental evaluation shows that DoS scales under various workload settings and provides significant benefits compared to sequential methods.