In earlier work, we defined "multi-structural databases," a data model to support efficient analysis of large, complex data sets over multiple numerical and hierarchical dimensions. We defined three types of queries over this data model, each of which required solving an optimization problem. An example is to find the ten most significant non-overlapping regions of geography crossed with time in which coverage of the Olympics was much stronger in newspapers than in online sources.

In this paper, we present a general query framework capturing the original three queries as part of a much broader family. We then give efficient algorithms for particular subclasses of this family. Finally, we describe an implementation of these algorithms that operates on a collection of several billion web documents. Using our algorithms in conjunction with random sampling techniques, our system can solve these queries in real time.
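To make the example query concrete, the following is a minimal sketch of what "most significant non-overlapping regions of geography crossed with time" could look like as code. It is not the algorithm from the paper: it uses a simple greedy heuristic, which is not guaranteed to find the optimal set. The `PARENT` hierarchy, the `Region` type, the `score` field, and the overlap rule (two regions overlap when their geography nodes lie on the same root-to-leaf path and their time intervals intersect) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical geography hierarchy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "World": None,
    "Europe": "World",
    "Asia": "World",
    "France": "Europe",
    "Japan": "Asia",
}


@dataclass(frozen=True)
class Region:
    geo: str      # node in the (assumed) geography hierarchy
    start: int    # first day of the time interval, inclusive
    end: int      # last day of the time interval, inclusive
    score: float  # assumed significance, e.g. newspaper vs. online coverage


def ancestors_or_self(node: str) -> Set[str]:
    """Return the node together with all of its ancestors."""
    out: Set[str] = set()
    cur: Optional[str] = node
    while cur is not None:
        out.add(cur)
        cur = PARENT[cur]
    return out


def overlaps(a: Region, b: Region) -> bool:
    """Assumed overlap rule: related geography AND intersecting time."""
    geo_related = (a.geo in ancestors_or_self(b.geo)
                   or b.geo in ancestors_or_self(a.geo))
    time_intersects = a.start <= b.end and b.start <= a.end
    return geo_related and time_intersects


def greedy_top_k(candidates: List[Region], k: int) -> List[Region]:
    """Greedy heuristic: take regions by descending score, skipping any
    region that overlaps one already chosen, until k are selected."""
    chosen: List[Region] = []
    for region in sorted(candidates, key=lambda r: r.score, reverse=True):
        if len(chosen) == k:
            break
        if all(not overlaps(region, c) for c in chosen):
            chosen.append(region)
    return chosen
```

For instance, given candidates for Europe (days 1 to 10), France (days 5 to 8), and Japan (days 1 to 10) in descending score order, `greedy_top_k(candidates, 2)` selects Europe and Japan, because France falls inside Europe in both geography and time. In a setting like the one described above, the scores would presumably be estimated from a random sample of documents rather than computed over the full collection.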