Mining of Massive Datasets

Authors:
Anand Rajaraman; Jeffrey David Ullman
Venue:
Mining of Massive Datasets
Year:
2011

Citing 0
Cited 18

Cluster computing, recursion and datalog
Datalog'10 Proceedings of the First international conference on Datalog Reloaded

Fast sampling word correlations of high dimensional text data (abstract only)
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data

Transitive closure and recursive Datalog implemented on clusters
Proceedings of the 15th International Conference on Extending Database Technology

Designing good MapReduce algorithms
XRDS: Crossroads, The ACM Magazine for Students - Big Data

Fast near neighbor search in high-dimensional binary data
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I

Instance-Based matching of large ontologies using locality-sensitive hashing
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part I

Fast group recommendations by applying user clustering
ER'12 Proceedings of the 31st international conference on Conceptual Modeling

Bounds on lengths of real valued vectors similar with regard to the tanimoto similarity
ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part I

SkyDiver: a framework for skyline diversification
Proceedings of the 16th International Conference on Extending Database Technology

Not Every Friend on a Social Network Can be Trusted: An Online Trust Indexing Algorithm
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 03

Privacy-preserving multi-keyword text search in the cloud supporting similarity-based ranking
Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security

Upper and lower bounds on the cost of a map-reduce computation
Proceedings of the VLDB Endowment

NIFTY: a system for large scale information flow tracking and clustering
Proceedings of the 22nd international conference on World Wide Web

Quasar: resource-efficient and QoS-aware cluster management
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

Extending market basket analysis with graph mining techniques: A real case
Expert Systems with Applications: An International Journal

Cloud based real-time collaborative filtering for item-item recommendations
Computers in Industry

Analyzing analytics
ACM SIGMOD Record

Enhancing K-Means using class labels
Intelligent Data Analysis

Abstract

The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and that can be applied to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors then explain the techniques of locality-sensitive hashing and of stream-processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related techniques for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications, recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.