The Internet gives us access to multimedia databases with billions of data instances. The massive amount of data available to researchers and application developers brings both opportunities and challenges. In particular, it makes data-driven approaches feasible, but at the same time it demands scalable algorithms. In this tutorial we present a range of algorithms and approaches that make it easier to scale our work to Internet-sized collections of multimedia data. The tutorial starts by giving attendees an overview of, and pointers to, the tools that will allow them to scale their work to massive datasets. It discusses the theoretical and practical problems posed by large data, applications where large amounts of data are important to consider, the types of algorithms that are practical with such large datasets, and examples of implementation techniques that make these algorithms practical. Many real-world examples and results illustrate the tutorial.