OLAPing social media: the case of Twitter
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
The standard approach to OLAP requires the measures and dimensions of a cube to be known at the design stage. In addition, dimensions are required to be non-volatile, balanced, and normalized. These constraints are too rigid for many data sets, especially semi-structured ones such as user-generated content in social networks and other web applications. We enrich the multidimensional analysis of such data via content-driven discovery of dimensions and classification hierarchies. The discovered elements are dynamic by nature and evolve along with the underlying data set. We demonstrate the benefits of our approach by building a data warehouse for the public stream of the popular social network and microblogging service Twitter. Our approach makes it possible to classify users by their activity, popularity, and behavior, and to organize messages by topic, impact, origin, method of generation, etc. Capturing these dynamic characteristics of the data adds more intelligence to the analysis and extends the limits of OLAP.
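As a rough illustration of what content-driven discovery of a dimension could look like, the following minimal Python sketch derives a user "activity" category from the data itself and then rolls a measure up along it. The record fields, the thresholds, and the function names are all assumptions made for this example; they are not taken from the paper or from the Twitter API.

```python
from collections import Counter, defaultdict

# Hypothetical tweet records; the field names are illustrative only.
tweets = [
    {"user": "alice", "retweets": 3},
    {"user": "alice", "retweets": 0},
    {"user": "alice", "retweets": 1},
    {"user": "bob", "retweets": 5},
    {"user": "bob", "retweets": 2},
    {"user": "carol", "retweets": 0},
]


def activity_level(n_tweets):
    """Map a tweet count to a category (thresholds are assumed, not from the paper)."""
    if n_tweets >= 3:
        return "high"
    if n_tweets == 2:
        return "medium"
    return "low"


def discover_activity_dimension(records):
    """Derive the user -> activity-level mapping from the data set itself.

    Re-running this on a newer snapshot may assign users to different
    levels, which is what makes the dimension dynamic.
    """
    counts = Counter(r["user"] for r in records)
    return {user: activity_level(n) for user, n in counts.items()}


def roll_up(records, user_to_level):
    """Aggregate tweet and retweet counts per discovered activity level."""
    totals = defaultdict(lambda: {"tweets": 0, "retweets": 0})
    for r in records:
        level = user_to_level[r["user"]]
        totals[level]["tweets"] += 1
        totals[level]["retweets"] += r["retweets"]
    return dict(totals)


dimension = discover_activity_dimension(tweets)
cube_slice = roll_up(tweets, dimension)
```

Here the category a user falls into is not fixed at design time; it is recomputed from the current data, so the same roll-up query can yield different groupings as the stream evolves.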