Content Management Systems (CMS) store enterprise data such as insurance claims, insurance policies, legal documents, and patent applications, as well as archival data, as in the case of digital libraries. Search over content supports information retrieval but gives users little insight into the data. A more analytical view is needed, one that offers aggregations, groupings, trends, pivot tables, charts, and so on. Multidimensional Content eXploration (MCX) is about effectively analyzing and exploring large amounts of content by combining keyword search with OLAP-style aggregation, navigation, and reporting. We focus on unstructured data, or more generally on documents or content with limited metadata, as typically encountered in CMS. We formally present how CMS content and metadata should be organized in a well-defined multidimensional structure so that sophisticated queries can be expressed and evaluated. The CMS metadata provide traditional static OLAP dimensions, which are combined with dynamic dimensions discovered from the analyzed keyword search result, as well as measures for document scores based on the link structure between the documents. In addition, we provide means for multidimensional content exploration through traditional OLAP roll-up and drill-down operations on the static and dynamic dimensions, along with solutions for multi-cube analysis and dynamic navigation of the content. We present our prototype, called DBPubs, which stores research publications as documents that can be searched and, most importantly, analyzed and explored. Finally, we present experimental results on the efficiency and effectiveness of our approach.
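To make the combination of keyword search and OLAP-style aggregation concrete, the following is a minimal Python sketch, not the MCX or DBPubs implementation. The toy corpus, the field names, the `score` values (standing in for a precomputed link-based document score), and all function names are assumptions introduced for illustration. Static dimensions come from metadata (`year`), a roll-up maps years to decades, and a dynamic dimension is approximated here by the terms that co-occur most often in the keyword search result.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: every field and value is invented for illustration.
# "score" stands in for a precomputed link-based document score.
DOCS = [
    {"id": 1, "year": 1997, "text": "data cube aggregation olap", "score": 0.9},
    {"id": 2, "year": 1998, "text": "keyword search olap text", "score": 0.5},
    {"id": 3, "year": 2007, "text": "keyword search over data streams", "score": 0.7},
    {"id": 4, "year": 2007, "text": "olap keyword analytics", "score": 0.4},
]


def keyword_search(docs, keyword):
    """Return the documents whose text contains the keyword as a whole term."""
    return [d for d in docs if keyword in d["text"].split()]


def aggregate(docs, dim_fn):
    """Group documents by a dimension value; return {value: (count, total_score)}."""
    cells = defaultdict(lambda: [0, 0.0])
    for d in docs:
        cell = cells[dim_fn(d)]
        cell[0] += 1
        cell[1] += d["score"]
    return {value: (count, round(total, 2)) for value, (count, total) in cells.items()}


def dynamic_dimension(docs, keyword, top_n=3):
    """Pick the terms that occur in the most result documents (query term excluded)."""
    counts = Counter()
    for d in docs:
        counts.update(set(d["text"].split()) - {keyword})
    # Break ties alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


hits = keyword_search(DOCS, "olap")
by_year = aggregate(hits, lambda d: d["year"])                 # static dimension
by_decade = aggregate(hits, lambda d: d["year"] // 10 * 10)    # roll-up: year -> decade
terms = dynamic_dimension(hits, "olap")                        # dynamic dimension values
```

Drilling down is the reverse step: restricting `hits` to one decade and aggregating by `year` again. A real system would derive dynamic dimensions and scores with far more care than this term count.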