Climate change, the global energy footprint, and strategies for sustainable development have become topics of considerable political and public interest. The public debate is informed by an exponentially growing amount of data, and there are diverse partisan interests when it comes to interpreting it. We therefore believe there is a need for data analysis methods whose results are intuitively understandable even to non-experts. Moreover, such methods should be efficient, so that non-expert users can perform their own analyses at low expense in order to understand the effects of different parameters and influential factors. In this paper, we discuss a new technique for factorizing data matrices that meets both of these requirements. The basic idea is to represent a set of data by means of convex combinations of extreme data points, a representation that often accommodates human cognition. In contrast to established factorization methods, the approach presented in this paper can also determine over-complete bases. At the same time, convex combinations allow for highly efficient matrix factorization. Based on techniques adopted from the field of distance geometry, we derive a linear-time algorithm to determine suitable basis vectors for factorization. Using several environmental and developmental data sets as examples, we discuss the performance and characteristics of the proposed approach and validate that significant efficiency gains are obtainable without loss of performance compared to existing convexity-constrained approaches.
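To make the basic idea concrete, the following is a minimal Python sketch of representing data points as convex combinations of a few extreme data points. It is not the algorithm derived in the paper: it swaps in a generic greedy farthest-point heuristic for choosing basis vectors (one pass over the data per basis vector, hence linear in the number of points) and plain projected gradient descent for the coefficients. The function names `select_extremes`, `project_to_simplex`, and `convex_coefficients`, the column-per-point layout of `X`, and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def select_extremes(X, k):
    """Greedily pick k column indices of X (shape d x n) that lie far apart.

    Heuristic stand-in, not the paper's method: start with the point
    farthest from the mean, then repeatedly add the point whose distance
    to its nearest already-chosen point is largest.
    """
    mean = X.mean(axis=1, keepdims=True)
    first = int(np.argmax(np.linalg.norm(X - mean, axis=0)))
    chosen = [first]
    # min_dist[j]: distance from column j to its nearest chosen column
    min_dist = np.linalg.norm(X - X[:, [first]], axis=0)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(X - X[:, [nxt]], axis=0)
        min_dist = np.minimum(min_dist, dist_to_new)
    return chosen

def project_to_simplex(v):
    """Euclidean projection of v onto {h : h >= 0, sum(h) = 1}."""
    u = np.sort(v)[::-1]                  # entries in descending order
    css = np.cumsum(u) - 1.0
    ranks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ranks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def convex_coefficients(W, x, iters=500):
    """Approximate argmin_h 0.5 * ||W h - x||^2 with h on the simplex."""
    k = W.shape[1]
    h = np.full(k, 1.0 / k)               # start at the simplex centre
    # Step size 1/L, where L (largest singular value squared) bounds
    # the curvature of the objective.
    step = 1.0 / (np.linalg.norm(W, 2) ** 2 + 1e-12)
    for _ in range(iters):
        grad = W.T @ (W @ h - x)
        h = project_to_simplex(h - step * grad)
    return h
```

Under these assumptions, `W = X[:, select_extremes(X, k)]` yields basis vectors that are actual data points, and each column of `X` is then approximated by `W @ h` with non-negative coefficients `h` that sum to one. This is what makes the factors readable as mixtures of real, extreme observations. Note that the farthest-point rule only tends to pick points near the boundary of the data; it does not guarantee convex hull vertices in general.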